Synchronize updates; fix AdamW lr_t (keras)
BUGFIXES:
- Last weight in the network would be updated with `t_cur` one update ahead, desynchronizing it from all other weights
- AdamW in `keras` (`optimizers.py`, `optimizers_225.py`): weight updates were not mediated by `eta_t`, so cosine annealing had no effect (see the `eta_t` sketch below)
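For context, `eta_t` is the cosine-annealing multiplier from the SGDR/AdamW scheme that is meant to scale every update. A minimal sketch of the intended behavior, with illustrative names (not the package's internals):

```python
import numpy as np

def eta_t(t_cur, total_iterations, eta_min=0.0, eta_max=1.0):
    """Cosine-annealing multiplier (SGDR); scales both lr and weight decay."""
    return eta_min + 0.5 * (eta_max - eta_min) * (
        1 + np.cos(np.pi * t_cur / total_iterations))

# Intended (mediated) update, schematically:
#   w <- w - eta_t * (lr * adam_step + weight_decay * w)
# The bug meant updates skipped this mediation, i.e. effectively eta_t == 1 always.
```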
FEATURES:
- Added `lr_t` to tf.keras optimizers to track the "actual" learning rate externally; use `K.eval(model.optimizer.lr_t)` to get the "actual" learning rate for a given `t_cur` and `iterations` (usage sketch after this list)
- Added an `lr_t` vs. iterations plot to README, with source code in `example.py`
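A minimal usage sketch for reading `lr_t`; the `AdamW` constructor arguments shown are assumptions based on the package's README, so adjust them to your setup:

```python
import numpy as np
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from keras_adamw import AdamW  # tf.keras variant of this package

ipt = Input((16,))
out = Dense(1)(ipt)
model = Model(ipt, out)

# `use_cosine_annealing` / `total_iterations` are assumed constructor args
optimizer = AdamW(lr=1e-3, use_cosine_annealing=True, total_iterations=100)
model.compile(optimizer, loss='mse')

x, y = np.random.randn(32, 16), np.random.randn(32, 1)
model.train_on_batch(x, y)

# "actual" learning rate, i.e. lr scaled by eta_t at the current t_cur
print("lr_t:", K.eval(model.optimizer.lr_t))
```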
MISC:
- Added `test_updates` to ensure all weights update synchronously, and that `eta_t` first applies to weights as-is and then updates according to `t_cur` (a rough sketch of such a check follows this list)
- Fixes #47
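
A rough idea of what a synchronous-update check can look like; this is a hypothetical helper, not the package's `test_updates` itself:

```python
import numpy as np
from tensorflow.keras import backend as K

def assert_all_weights_updated(model, x, y):
    """Train one batch and verify every trainable weight changed,
    i.e. no weight lags the others by an update."""
    before = [K.get_value(w).copy() for w in model.trainable_weights]
    model.train_on_batch(x, y)
    after = [K.get_value(w) for w in model.trainable_weights]

    changed = [not np.allclose(b, a) for b, a in zip(before, after)]
    stale = [i for i, c in enumerate(changed) if not c]
    assert all(changed), f"stale weights at indices: {stale}"
```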