
Releases: warner-benjamin/optimi

v0.2.1: param_groups_weight_decay

06 Jun 00:25

Adds param_groups_weight_decay, which creates optimizer parameter groups that exclude bias and normalization layers from weight decay. param_groups_weight_decay is lightly modified from PyTorch Image Models (timm).
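
A minimal usage sketch, assuming a timm-style call of model plus a weight_decay value (the exact keyword arguments may differ; the model here is hypothetical):

import torch.nn as nn
from optimi import AdamW, param_groups_weight_decay

# hypothetical model for illustration
model = nn.Sequential(nn.Linear(64, 64), nn.LayerNorm(64), nn.Linear(64, 1))

# build parameter groups: bias and normalization parameters are placed in a
# group with weight_decay=0, all other parameters use the supplied value
params = param_groups_weight_decay(model, weight_decay=1e-2)

# pass the parameter groups to any optimi (or PyTorch) optimizer
opt = AdamW(params, lr=1e-3)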

Full Changelog: v0.2.0...v0.2.1

v0.2.0: Gradient Release & Optimizer Accumulation

11 Mar 04:13
  • Add Gradient Release
  • Add Optimizer Accumulation
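
The release note itself does not show usage; the sketch below is based on the optimi documentation and assumes the prepare_for_gradient_release and remove_gradient_release helpers and the gradient_release optimizer argument (the model and shapes are hypothetical):

import torch
from torch import nn
from optimi import AdamW, prepare_for_gradient_release, remove_gradient_release

# hypothetical model for illustration
model = nn.Linear(64, 1)

# create the optimizer with gradient release enabled
opt = AdamW(model.parameters(), lr=1e-3, gradient_release=True)

# register hooks so each layer's optimizer step runs during the backward pass,
# freeing that layer's gradient immediately instead of storing all gradients
prepare_for_gradient_release(model, opt)

# backward now performs the optimizer step; opt.step() and opt.zero_grad()
# are no longer required
loss = model(torch.randn(8, 64)).mean()
loss.backward()

# remove the hooks when training is finished
remove_gradient_release(model)

Optimizer accumulation builds on the same hooks: toggling a flag on the optimizer (named optimizer_accumulation, per the documentation) accumulates gradients directly into the optimizer states across micro-batches, approximating gradient accumulation without storing full gradients.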

Full Changelog: v0.1.2...v0.2.0

v0.1.2

18 Dec 16:05

Adds the RAdam and Ranger optimizers.

Full Changelog: v0.1.1...v0.1.2

v0.1.1: Initial Release

19 Nov 01:56

optimī

Fast, Modern, and Low Precision PyTorch Optimizers

optimi enables accurate low precision training via Kahan summation, supports fully decoupled weight decay, and features fast implementations of modern optimizers.

Low Precision Training with Kahan Summation

optimi optimizers can match the performance of mixed precision when training in BFloat16 by using Kahan summation.

Training in BFloat16 with Kahan summation can reduce non-activation training memory usage by 37.5 to 45.5 percent when using an Adam optimizer. BFloat16 training increases single GPU training speed by ~10 percent at the same batch size.
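
A hedged sketch of pure BFloat16 training (the model and hyperparameters are illustrative; Kahan summation is assumed to be enabled automatically for low precision parameters via the kahan_sum argument):

import torch
from torch import nn
from optimi import AdamW

# hypothetical model cast to BFloat16 for pure low precision training
model = nn.Linear(256, 256, dtype=torch.bfloat16)

# Kahan summation compensates for BFloat16's limited precision during the
# parameter update; optimi is assumed to turn it on automatically for low
# precision parameters (equivalent to passing kahan_sum=True)
opt = AdamW(model.parameters(), lr=1e-3)

loss = model(torch.randn(32, 256, dtype=torch.bfloat16)).mean()
loss.backward()
opt.step()
opt.zero_grad()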

Fully Decoupled Weight Decay

In addition to supporting PyTorch-style decoupled weight decay, optimi optimizers also support fully decoupled weight decay.

Fully decoupled weight decay decouples weight decay from the learning rate, more accurately following Decoupled Weight Decay Regularization. This can help simplify hyperparameter tuning as the optimal weight decay is no longer tied to the learning rate.
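
A short sketch, assuming fully decoupled weight decay is selected with a decouple_lr argument (argument name and values are illustrative; because the decay is no longer multiplied by the learning rate, a smaller value is typically used):

from torch import nn
from optimi import AdamW

# hypothetical model for illustration
model = nn.Linear(64, 1)

# PyTorch-style decoupled weight decay: the decay applied each step is
# scaled by the learning rate
opt = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

# fully decoupled weight decay: the decay is applied independently of the
# learning rate, so tuning one no longer changes the other
opt = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-5, decouple_lr=True)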

Foreach Implementations

All optimi optimizers have fast foreach implementations, which can significantly outperform the for-loop versions. optimi reuses the gradient buffer for temporary variables to reduce foreach memory usage.
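
An illustrative sketch, assuming the foreach argument follows the PyTorch convention of automatic selection when unspecified:

from torch import nn
from optimi import AdamW

# hypothetical model for illustration
model = nn.Linear(64, 1)

# the foreach implementation is selected automatically when available
opt = AdamW(model.parameters(), lr=1e-3)

# explicitly fall back to the for-loop implementation, e.g. for debugging
opt = AdamW(model.parameters(), lr=1e-3, foreach=False)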

Documentation

https://optimi.benjaminwarner.dev

Install

optimi is available to install from PyPI:

pip install torch-optimi

Optimizers

optimi implements the following optimizers: