Minor bug fix: changing train_step in examples code to take a mean of the stats instead of taking from the first device. Because the optimizer syncs its own stats (like loss), this didn't matter except for stats returned from the kfac_jax optimizer (or Optax optimizers using OptaxWrapper). However, the Polyak-averaged loss wasn't actually synced across devices (as it's not part of the optimizer anymore), so "loss_polyak" was being reported only for the first device. #304

Closed

copybara-service[bot]
Minor bug fix: change train_step in the examples code to take the mean of the stats across devices instead of taking them from the first device. Because the optimizer syncs its own stats (like loss), this didn't matter except for stats returned from the kfac_jax optimizer (or Optax optimizers using OptaxWrapper). However, the Polyak-averaged loss wasn't actually synced across devices (as it's no longer part of the optimizer), so "loss_polyak" was being reported only for the first device.
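To illustrate the difference, here is a minimal sketch. It is not the actual kfac_jax examples code; the `stats` dictionary, its values, and the helper names `first_device` and `device_mean` are hypothetical. It assumes each stat carries a leading per-device axis, as a pmapped step would return it. A stat that is already synced is unaffected by the change, while an unsynced one such as "loss_polyak" is.

```python
import jax
import jax.numpy as jnp

# Hypothetical per-device stats: every leaf has a leading axis of
# size num_devices (4 here). The values are made up for illustration.
stats = {
    "loss": jnp.array([0.5, 0.5, 0.5, 0.5]),             # already synced
    "loss_polyak": jnp.array([0.40, 0.44, 0.48, 0.52]),  # not synced
}


def first_device(tree):
    """Old behaviour (sketch): report only the first device's value."""
    return jax.tree_util.tree_map(lambda x: x[0], tree)


def device_mean(tree):
    """New behaviour (sketch): average each stat over the device axis."""
    return jax.tree_util.tree_map(lambda x: jnp.mean(x, axis=0), tree)


old = first_device(stats)
new = device_mean(stats)

# "loss" is identical either way; "loss_polyak" differs because the
# per-device values were never synced.
print(float(old["loss"]), float(new["loss"]))
print(float(old["loss_polyak"]), float(new["loss_polyak"]))
```

Averaging on the host like this is only one way to do it; an alternative would be to sync the value inside the pmapped function with `jax.lax.pmean`.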

copybara-service[bot] force-pushed the test_700710272 branch 2 times, most recently from f2d9c01 to 40be744, on November 29, 2024 at 18:41
copybara-service[bot] closed this on Nov 29, 2024
copybara-service[bot] deleted the test_700710272 branch on November 29, 2024 at 18:47