Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method

Diakonikolas, Jelena; Orecchia, Lorenzo

doi:10.4230/LIPIcs.ITCS.2018.23

File

Cite AsGet BibTex

Jelena Diakonikolas and Lorenzo Orecchia. Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method. In 9th Innovations in Theoretical Computer Science Conference (ITCS 2018). Leibniz International Proceedings in Informatics (LIPIcs), Volume 94, pp. 23:1-23:19, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/LIPIcs.ITCS.2018.23

Abstract

We provide a novel accelerated first-order method that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model. To this day, Nesterov's Accelerated Gradient Descent (AGD) and variations thereof were the only methods achieving acceleration in this standard blackbox model. In contrast, our algorithm is significantly different from AGD, as it relies on a predictor-corrector approach similar to that used by Mirror-Prox [Nemirovski, 2004] and Extra-Gradient Descent [Korpelevich, 1977] in the solution of convex-concave saddle point problems. For this reason, we dub our algorithm Accelerated Extra-Gradient Descent (AXGD). Its construction is motivated by the discretization of an accelerated continuous-time dynamics [Krichene et al., 2015] using the classical method of implicit Euler discretization. Our analysis explicitly shows the effects of discretization through a conceptually novel primal-dual viewpoint. Moreover, we show that the method is quite general: it attains optimal convergence rates for other classes of objectives (e.g., those with generalized smoothness properties or that are non-smooth and Lipschitz-continuous) using the appropriate choices of step lengths. Finally, we present experiments showing that our algorithm matches the performance of Nesterov's method, while appearing more robust to noise in some cases.

Metrics

Access Statistics
Total Accesses (updated on a weekly basis)

0

PDF Downloads

0

Metadata Views

References

Zeyuan Allen-Zhu and Lorenzo Orecchia. Linear coupling: An ultimate unification of gradient and mirror descent. In Proc. ITCS'17, 2017.
Raef Bassily, Adam Smith, and Abhradeep Thakurta. Private empirical risk minimization: Efficient algorithms and tight error bounds. In Proc. IEEE FOCS'14, 2014.
Aharon Ben-Tal and Arkadi Nemirovski. Lectures on modern convex optimization: Analysis, algorithms, and engineering applications. MPS-SIAM Series on Optimization. SIAM, 2001.
Sébastien Bubeck. Theory of convex optimization for machine learning. CoRR, abs/1405.4980, 2014. URL: http://arxiv.org/abs/1405.4980v1.
Sébastien Bubeck, Yin Tat Lee, and Mohit Singh. A geometric alternative to nesterov’s accelerated gradient descent. CoRR, abs/1506.08187, 2015. URL: http://arxiv.org/abs/1506.08187.
Olivier Devolder, François Glineur, and Yurii Nesterov. First-order methods of smooth convex optimization with inexact oracle. Math. Prog., 146(1-2):37-75, 2014.
Jelena Diakonikolas and Lorenzo Orecchia. The approximate gap technique: A unified approach to optimal first-order methods, 2017. Manuscript.
A. Ene and H. L. Nguyen. Constrained submodular maximization: Beyond 1/e. In Proc. IEEE FOCS'16, 2016.
E Hairer, SP Nørsett, and G Wanner. Solving Ordinary Differential Equations I (2nd Revised. Ed.): Nonstiff Problems. Springer Ser. Comput. Math. Springer-Verlag New York, Inc., 1993.
Moritz Hardt. Robustness vs acceleration, 2014. URL: http://blog.mrtz.org/2014/08/18/robustness-versus-acceleration.html.
Rahul Jain, Zhengfeng Ji, Sarvagya Upadhyay, and John Watrous. QIP = PSPACE. Journal of the ACM (JACM), 58(6):30, 2011.
Jonathan A. Kelner, Yin Tat Lee, Lorenzo Orecchia, and Aaron Sidford. An almost-linear-time algorithm for approximate max flow in undirected graphs, and its multicommodity generalizations. In Proc. ACM-SIAM SODA'14, 2014.
Jonathan A. Kelner, Lorenzo Orecchia, Aaron Sidford, and Zeyuan Allen Zhu. A simple, combinatorial algorithm for solving SDD systems in nearly-Linear time. In Proc. ACM STOC'13, 2013.
G. M. Korpelevich. The extragradient method for finding saddle points and other problems. Matekon : translations of Russian &East European mathematical economics, 13(4):35-49, 1977.
Walid Krichene, Alexandre Bayen, and Peter L Bartlett. Accelerated mirror descent in continuous and discrete time. In Proc. NIPS'15, 2015.
Guanghui Lan. An optimal method for stochastic composite optimization. Math. Prog., 133(1-2):365-397, January 2011.
Yin Tat Lee, Satish Rao, and Nikhil Srivastava. A new approach to computing maximum flows using electrical flows. In Proc. ACM STOC '13, 2013.
Arkadi Nemirovski. Prox-method with rate of convergence O(1/t) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems. SIAM J. Optimiz., 15(1):229-251, 2004.
Arkadi S Nemirovski and Yurii Evgen'evich Nesterov. Optimal methods of smooth convex minimization. Zh. Vychisl. Mat. i Mat. Fiz., 25(3):356-369, 1985.
Arkadii Nemirovskii and David Borisovich Yudin. Problem complexity and method efficiency in optimization. Wiley, 1983.
Yu Nesterov. Universal gradient methods for convex optimization problems. Math. Prog., 152(1-2):381-404, 2015.
Yurii Nesterov. A method of solving a convex programming problem with convergence rate O(1/k²). In Doklady AN SSSR (translated as Soviet Mathematics Doklady), volume 269, pages 543-547, 1983.
Yurii Nesterov. Introductory Lectures on Convex Programming Volume: A Basic course, volume I. Kluwer Academic Publishers, 2004.
Yurii Nesterov. Excessive gap technique in nonsmooth convex minimization. SIAM J. Optimiz., 16(1):235-249, January 2005.
Yurii Nesterov. Smooth minimization of non-smooth functions. Math. Prog., 103(1):127-152, December 2005.
Yurii Nesterov. Accelerating the cubic regularization of Newton’s method on convex problems. Math. Prog., 112(1):159-181, 2008.
Yurii Nesterov. Gradient methods for minimizing composite functions. Math. Prog., 140(1):125-161, 2013.
Yurii Nesterov. Introductory lectures on convex optimization: A basic course. Springer Science &Business Media, 2013.
Jonah Sherman. Nearly maximum flows in nearly linear time. In Proc. IEEE FOCS'13, 2013.
Daniel A. Spielman and Shang-Hua Teng. Nearly-linear time algorithms for graph partitioning, graph sparsification, and solving linear systems. In Proc. ACM STOC '04, 2004.
Weijie Su, Stephen Boyd, and Emmanuel Candes. A differential equation for modeling nesterov’s accelerated gradient method: Theory and insights. In Proc. NIPS'14, 2014.
Paul Tseng. On accelerated proximal gradient methods for convex-concave optimization, 2008.
Andre Wibisono, Ashia C Wilson, and Michael I Jordan. A variational perspective on accelerated methods in optimization. In Proc. Natl. Acad. Sci. U.S.A., 2016.

Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method

Authors Jelena Diakonikolas, Lorenzo Orecchia

File

Document Identifiers

Author Details

Cite AsGet BibTex

Abstract

Keywords

Metrics

References

Thanks for your feedback!

Could not send message