---
title: 'ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD'
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: Concurrent algorithmic implementations of Stochastic Gradient Descent (SGD) give rise to critical questions for compute-intensive Machine Learning (ML). Asynchrony implies speedup in some contexts, and challenges in others, as stale updates may lead to slower, or non-converging executions. While previous works showed that asynchrony-adaptiveness can improve stability and speedup by reducing the step size for stale updates according to static rules, there is no one-size-fits-all adaptation rule, since the optimal strategy depends on several factors. We introduce (i) $\mathtt{ASAP.SGD}$, an analytical framework capturing necessary and desired properties of staleness-adaptive step size functions, and (ii) \textsc{tail}-$\tau$, a method for utilizing key properties of the <em>execution instance</em>, generating a tailored strategy that not only dampens the impact of stale updates, but also leverages fresh ones. We recover convergence bounds for adaptiveness functions satisfying the $\mathtt{ASAP.SGD}$ conditions for general, convex and non-convex problems, and establish novel bounds for ones satisfying the Polyak-Lojasiewicz property. We evaluate \textsc{tail}-$\tau$ with representative <em>AsyncSGD</em> concurrent algorithms, for Deep Learning problems, showing \textsc{tail}-$\tau$ is a vital complement to <em>AsyncSGD</em>, with (i) persistent speedup in wall-clock convergence time across the parallelism spectrum, (ii) considerably lower risk of non-convergence, as well as (iii) precision levels for which original SGD implementations fail.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: backstrom22a
month: 0
tex_title: '{ASAP}.{SGD}: Instance-based Adaptiveness to Staleness in Asynchronous {SGD}'
firstpage: 1261
lastpage: 1276
page: 1261-1276
order: 1261
cycles: false
bibtex_author: B{\"a}ckstr{\"o}m, Karl and Papatriantafilou, Marina and Tsigas, Philippas
author:
- given: Karl
  family: Bäckström
- given: Marina
  family: Papatriantafilou
- given: Philippas
  family: Tsigas
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: '162'
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras: []
---
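The abstract describes staleness-adaptive step sizes conceptually: scale each asynchronous update's step size by a function of its staleness τ, dampening stale updates while keeping fresh ones near the full step. The sketch below is a minimal illustrative instance of that general idea, not the paper's tail-τ method; the class name and the empirical-tail scaling rule are assumptions chosen for clarity.

```python
# Hypothetical sketch of a staleness-adaptive step size for AsyncSGD.
# NOT the paper's tail-tau rule: scaling by the empirical tail probability
# of the observed staleness distribution is an illustrative assumption.
from collections import Counter


class StalenessAdaptiveStep:
    def __init__(self, base_lr):
        self.base_lr = base_lr
        self.counts = Counter()  # empirical histogram of observed staleness
        self.total = 0

    def observe(self, tau):
        """Record the staleness of an applied update."""
        self.counts[tau] += 1
        self.total += 1

    def step_size(self, tau):
        """Return a step size that shrinks as staleness tau grows."""
        if self.total == 0:
            return self.base_lr
        # Empirical tail probability P(staleness >= tau): close to 1 for
        # fresh updates (small tau), small for very stale ones.
        tail = sum(c for s, c in self.counts.items() if s >= tau) / self.total
        return self.base_lr * tail
```

In an AsyncSGD loop, each worker would record the model version it read; when its gradient arrives, the server computes τ as the number of updates applied since that read, calls `observe(tau)`, and applies the gradient with `step_size(tau)`, so stale updates are dampened while fresh ones retain (nearly) the full base step.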