---
title: AdaGrad Avoids Saddle Points
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: Adaptive first-order methods in optimization have widespread ML applications due to their ability to adapt to non-convex landscapes. However, their convergence guarantees are typically stated in terms of vanishing gradient norms, which leaves open the issue of converging to undesirable saddle points (or even local maxima). In this paper, we focus on the AdaGrad family of algorithms - from scalar to full-matrix preconditioning - and we examine the question of whether the method’s trajectories avoid saddle points. A major challenge that arises here is that AdaGrad’s step-size (or, more accurately, the method’s preconditioner) evolves over time in a filtration-dependent way, i.e., as a function of all gradients observed in earlier iterations; as a result, avoidance results for methods with a constant or vanishing step-size do not apply. We resolve this challenge by combining a series of step-size stabilization arguments with a recursive representation of the AdaGrad preconditioner that allows us to employ center-stable techniques and ultimately show that the induced trajectories avoid saddle points from almost any initial condition.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: antonakopoulos22a
month: 0
tex_title: "{A}da{G}rad Avoids Saddle Points"
firstpage: 731
lastpage: 771
page: 731-771
order: 731
cycles: false
bibtex_author: Antonakopoulos, Kimon and Mertikopoulos, Panayotis and Piliouras, Georgios and Wang, Xiao
author:
- given: Kimon
  family: Antonakopoulos
- given: Panayotis
  family: Mertikopoulos
- given: Georgios
  family: Piliouras
- given: Xiao
  family: Wang
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: '162'
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras:
---
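
A minimal sketch of the diagonal (scalar-per-coordinate) AdaGrad update discussed in the abstract, showing how the preconditioner is built from all previously observed gradients. The objective, function names (`adagrad_step`, `grad_f`), and hyperparameters below are illustrative assumptions, not anything taken from the paper itself.

```python
import numpy as np

def adagrad_step(x, grad, g_accum, lr=0.1, eps=1e-8):
    """One AdaGrad step with a diagonal preconditioner.

    g_accum holds the running sum of squared gradients, so the effective
    step-size at each iteration depends on every gradient seen so far:
    the filtration-dependent behaviour the abstract highlights.
    """
    g_accum = g_accum + grad ** 2
    x = x - lr * grad / (np.sqrt(g_accum) + eps)
    return x, g_accum

# Toy objective with a saddle point at the origin: f(x) = x[0]**2 - x[1]**2.
grad_f = lambda x: np.array([2.0 * x[0], -2.0 * x[1]])

x, g_accum = np.array([0.5, 1e-3]), np.zeros(2)
for _ in range(200):
    x, g_accum = adagrad_step(x, grad_f(x), g_accum)
print(x)  # the iterate drifts away from the saddle along the unstable direction
```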