title | booktitle | abstract | layout | series | publisher | issn | id | month | tex_title | firstpage | lastpage | page | order | cycles | bibtex_author | author | date | address | container-title | volume | genre | issued | extras |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ASAP.SGD: Instance-based Adaptiveness to Staleness in Asynchronous SGD |
Proceedings of the 39th International Conference on Machine Learning |
Concurrent algorithmic implementations of Stochastic Gradient Descent (SGD) give rise to critical questions for compute-intensive Machine Learning (ML). Asynchrony implies speedup in some contexts, and challenges in others, as stale updates may lead to slower, or non-converging executions. While previous works showed asynchrony-adaptiveness can improve stability and speedup by reducing the step size for stale updates according to static rules, there is no one-size-fits-all adaptation rule, since the optimal strategy depends on several factors. We introduce (i) $\mathtt{ASAP.SGD}$, an analytical framework capturing necessary and desired properties of staleness-adaptive step size functions and (ii) \textsc{tail}-$\tau$, a method for utilizing key properties of the *execution instance*, generating a tailored strategy that not only dampens the impact of stale updates, but also leverages fresh ones. We recover convergence bounds for adaptiveness functions satisfying the |
inproceedings |
Proceedings of Machine Learning Research |
PMLR |
2640-3498 |
backstrom22a |
0 |
{ASAP}.{SGD}: Instance-based Adaptiveness to Staleness in Asynchronous {SGD} |
1261 |
1276 |
1261-1276 |
1261 |
false |
B{\"a}ckstr{\"o}m, Karl and Papatriantafilou, Marina and Tsigas, Philippas |
|
2022-06-28 |
Proceedings of the 39th International Conference on Machine Learning |
162 |
inproceedings |
|