---
title: 'ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD'
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: Concurrent algorithmic implementations of Stochastic Gradient Descent (SGD) give rise to critical questions for compute-intensive Machine Learning (ML). Asynchrony implies speedup in some contexts, and challenges in others, as stale updates may lead to slower, or non-converging executions. While previous works showed that asynchrony-adaptiveness can improve stability and speedup by reducing the step size for stale updates according to static rules, there is no one-size-fits-all adaptation rule, since the optimal strategy depends on several factors. We introduce (i) $\mathtt{ASAP.SGD}$, an analytical framework capturing necessary and desired properties of staleness-adaptive step size functions, and (ii) \textsc{tail}-$\tau$, a method for utilizing key properties of the <em>execution instance</em>, generating a tailored strategy that not only dampens the impact of stale updates, but also leverages fresh ones. We recover convergence bounds for adaptiveness functions satisfying the $\mathtt{ASAP.SGD}$ conditions for general, convex and non-convex problems, and establish novel bounds for ones satisfying the Polyak-Lojasiewicz property. We evaluate \textsc{tail}-$\tau$ with representative <em>AsyncSGD</em> concurrent algorithms, for Deep Learning problems, showing \textsc{tail}-$\tau$ is a vital complement to <em>AsyncSGD</em>, with (i) persistent speedup in wall-clock convergence time across the parallelism spectrum, (ii) considerably lower risk of non-convergence, as well as (iii) precision levels for which original SGD implementations fail.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: backstrom22a
month: 0
tex_title: '{ASAP}.{SGD}: Instance-based Adaptiveness to Staleness in Asynchronous {SGD}'
firstpage: 1261
lastpage: 1276
page: 1261-1276
order: 1261
cycles: false
bibtex_author: B{\"a}ckstr{\"o}m, Karl and Papatriantafilou, Marina and Tsigas, Philippas
author:
- given: Karl
  family: Bäckström
- given: Marina
  family: Papatriantafilou
- given: Philippas
  family: Tsigas
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: '162'
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras: []
---
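The abstract describes staleness-adaptive step sizes conceptually: scale each asynchronous update's step size by a function of its staleness τ, dampening stale updates while keeping fresh ones near the full step. The sketch below is a minimal illustrative instance of that general idea, not the paper's tail-τ method; the class name and the empirical-tail scaling rule are assumptions chosen for clarity.

```python
# Hypothetical sketch of a staleness-adaptive step size for AsyncSGD.
# NOT the paper's tail-tau rule: scaling by the empirical tail probability
# of the observed staleness distribution is an illustrative assumption.
from collections import Counter


class StalenessAdaptiveStep:
    def __init__(self, base_lr):
        self.base_lr = base_lr
        self.counts = Counter()  # empirical histogram of observed staleness
        self.total = 0

    def observe(self, tau):
        """Record the staleness of an applied update."""
        self.counts[tau] += 1
        self.total += 1

    def step_size(self, tau):
        """Return a step size that shrinks as staleness tau grows."""
        if self.total == 0:
            return self.base_lr
        # Empirical tail probability P(staleness >= tau): close to 1 for
        # fresh updates (small tau), small for very stale ones.
        tail = sum(c for s, c in self.counts.items() if s >= tau) / self.total
        return self.base_lr * tail
```

In an AsyncSGD loop, each worker would record the model version it read; when its gradient arrives, the server computes τ as the number of updates applied since that read, calls `observe(tau)`, and applies the gradient with `step_size(tau)`, so stale updates are dampened while fresh ones retain (nearly) the full base step.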