Minor bug fix: change train_step in the examples code to take the mean of the stats across devices instead of taking them from the first device. Because the optimizer syncs its own stats (like loss), this didn't matter except for stats returned from the kfac_jax optimizer (or Optax optimizers using OptaxWrapper). However, the Polyak-averaged loss wasn't actually synced across devices (as it's no longer part of the optimizer), so "loss_polyak" was being reported only for the first device. #304

copybara-service · 2024-11-27T19:45:35Z

Minor bug fix: change train_step in the examples code to take the mean of the stats across devices instead of taking them from the first device. Because the optimizer syncs its own stats (like loss), this didn't matter except for stats returned from the kfac_jax optimizer (or Optax optimizers using OptaxWrapper). However, the Polyak-averaged loss wasn't actually synced across devices (as it's no longer part of the optimizer), so "loss_polyak" was being reported only for the first device.
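As a rough illustration of the difference (not the actual diff), the sketch below assumes the stats arrive as a dict of arrays with a leading device axis. The names `per_device_stats`, `first_device_stats`, and `mean_over_devices` are hypothetical and do not come from the kfac_jax examples code; the numbers are made up.

```python
import jax
import jax.numpy as jnp

# Hypothetical stats for 4 devices; the leading axis indexes devices.
# "loss" is already synced, so every device holds the same value.
# "loss_polyak" is not synced, so each device holds a different value.
per_device_stats = {
    "loss": jnp.array([0.5, 0.5, 0.5, 0.5]),
    "loss_polyak": jnp.array([1.0, 2.0, 3.0, 4.0]),
}


def first_device_stats(stats):
    """Old behaviour (assumed): report only the first device's values."""
    return jax.tree_util.tree_map(lambda x: x[0], stats)


def mean_over_devices(stats):
    """New behaviour (assumed): average each stat over the device axis."""
    return jax.tree_util.tree_map(lambda x: jnp.mean(x, axis=0), stats)


old_report = first_device_stats(per_device_stats)
new_report = mean_over_devices(per_device_stats)

print("first device:", {k: float(v) for k, v in old_report.items()})
print("device mean: ", {k: float(v) for k, v in new_report.items()})
```

For the synced "loss" the two reports agree, while for the unsynced "loss_polyak" only the mean reflects all devices.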

copybara-service bot force-pushed the test_700710272 branch 2 times, most recently from f2d9c01 to 40be744 Compare November 29, 2024 18:41

copybara-service bot closed this Nov 29, 2024

copybara-service bot force-pushed the test_700710272 branch from 40be744 to face046 Compare November 29, 2024 18:47

copybara-service bot deleted the test_700710272 branch November 29, 2024 18:47
