[BugFix] Fix batching envs with non tensor data #2674

Merged
vmoens merged 1 commit into gh/vmoens/64/base from gh/vmoens/64/head on Dec 20, 2024

[ghstack-poisoned]
pytorch-bot bot commented Dec 20, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2674

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 6 Unrelated Failures

As of commit 5ec14bb with merge base 133d709:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label Dec 20, 2024
vmoens added a commit that referenced this pull request Dec 20, 2024
ghstack-source-id: daba8a95459cfa978da09291757b6380fab4f308
Pull Request resolved: #2674
@vmoens added the bug label Dec 20, 2024
@vmoens merged commit 5ec14bb into gh/vmoens/64/base Dec 20, 2024
56 of 60 checks passed
@vmoens deleted the gh/vmoens/64/head branch December 20, 2024 10:27

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}11$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.4349s 0.4313s 2.3188 Ops/s 2.2279 Ops/s $\color{#35bf28}+4.08\%$
test_transformed 0.6118s 0.6072s 1.6470 Ops/s 1.6344 Ops/s $\color{#35bf28}+0.77\%$
test_serial 1.3662s 1.3610s 0.7347 Ops/s 0.7286 Ops/s $\color{#35bf28}+0.85\%$
test_parallel 1.2929s 1.2097s 0.8267 Ops/s 0.8226 Ops/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-True-True-True-True] 0.3472ms 31.0877μs 32.1671 KOps/s 32.4369 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[True-True-True-True-False] 63.4390μs 18.1450μs 55.1116 KOps/s 54.3838 KOps/s $\color{#35bf28}+1.34\%$
test_step_mdp_speed[True-True-True-False-True] 49.2030μs 17.7613μs 56.3022 KOps/s 56.7291 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[True-True-True-False-False] 34.6350μs 10.1665μs 98.3620 KOps/s 96.0031 KOps/s $\color{#35bf28}+2.46\%$
test_step_mdp_speed[True-True-False-True-True] 73.0570μs 33.2830μs 30.0453 KOps/s 30.2628 KOps/s $\color{#d91a1a}-0.72\%$
test_step_mdp_speed[True-True-False-True-False] 79.0380μs 20.2234μs 49.4476 KOps/s 49.2321 KOps/s $\color{#35bf28}+0.44\%$
test_step_mdp_speed[True-True-False-False-True] 51.7070μs 19.7212μs 50.7070 KOps/s 50.8715 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-False-False-False] 0.1243ms 12.5867μs 79.4491 KOps/s 81.1995 KOps/s $\color{#d91a1a}-2.16\%$
test_step_mdp_speed[True-False-True-True-True] 77.4750μs 35.3087μs 28.3216 KOps/s 28.6532 KOps/s $\color{#d91a1a}-1.16\%$
test_step_mdp_speed[True-False-True-True-False] 61.0450μs 22.2270μs 44.9903 KOps/s 45.4230 KOps/s $\color{#d91a1a}-0.95\%$
test_step_mdp_speed[True-False-True-False-True] 79.2180μs 19.7864μs 50.5398 KOps/s 51.4103 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[True-False-True-False-False] 52.0690μs 12.2602μs 81.5646 KOps/s 81.1970 KOps/s $\color{#35bf28}+0.45\%$
test_step_mdp_speed[True-False-False-True-True] 0.1405ms 36.9711μs 27.0481 KOps/s 27.0664 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[True-False-False-True-False] 65.3630μs 23.9571μs 41.7413 KOps/s 41.6041 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-False-False-True] 66.1240μs 21.7273μs 46.0250 KOps/s 47.1336 KOps/s $\color{#d91a1a}-2.35\%$
test_step_mdp_speed[True-False-False-False-False] 65.0520μs 14.2643μs 70.1052 KOps/s 69.9806 KOps/s $\color{#35bf28}+0.18\%$
test_step_mdp_speed[False-True-True-True-True] 0.2986ms 35.5196μs 28.1534 KOps/s 28.1362 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[False-True-True-True-False] 56.3560μs 22.5691μs 44.3083 KOps/s 45.0305 KOps/s $\color{#d91a1a}-1.60\%$
test_step_mdp_speed[False-True-True-False-True] 75.4510μs 22.9008μs 43.6666 KOps/s 44.7583 KOps/s $\color{#d91a1a}-2.44\%$
test_step_mdp_speed[False-True-True-False-False] 49.1120μs 13.7691μs 72.6264 KOps/s 72.4497 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[False-True-False-True-True] 92.7040μs 37.2023μs 26.8801 KOps/s 27.2005 KOps/s $\color{#d91a1a}-1.18\%$
test_step_mdp_speed[False-True-False-True-False] 50.4450μs 24.3123μs 41.1315 KOps/s 41.2680 KOps/s $\color{#d91a1a}-0.33\%$
test_step_mdp_speed[False-True-False-False-True] 2.6353ms 24.6850μs 40.5104 KOps/s 41.3159 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[False-True-False-False-False] 65.9330μs 15.5759μs 64.2017 KOps/s 64.3838 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-False-True-True-True] 76.7840μs 39.3407μs 25.4190 KOps/s 25.5507 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[False-False-True-True-False] 78.4160μs 26.3169μs 37.9985 KOps/s 38.3807 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[False-False-True-False-True] 70.4320μs 24.3332μs 41.0961 KOps/s 41.5435 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[False-False-True-False-False] 72.1550μs 15.5896μs 64.1455 KOps/s 64.5227 KOps/s $\color{#d91a1a}-0.58\%$
test_step_mdp_speed[False-False-False-True-True] 0.1059ms 40.7777μs 24.5232 KOps/s 23.7035 KOps/s $\color{#35bf28}+3.46\%$
test_step_mdp_speed[False-False-False-True-False] 56.7970μs 27.8384μs 35.9217 KOps/s 36.0888 KOps/s $\color{#d91a1a}-0.46\%$
test_step_mdp_speed[False-False-False-False-True] 87.5340μs 25.9134μs 38.5900 KOps/s 39.1601 KOps/s $\color{#d91a1a}-1.46\%$
test_step_mdp_speed[False-False-False-False-False] 45.5750μs 17.3616μs 57.5984 KOps/s 57.8370 KOps/s $\color{#d91a1a}-0.41\%$
test_values[generalized_advantage_estimate-True-True] 10.9539ms 10.3476ms 96.6409 Ops/s 103.5377 Ops/s $\textbf{\color{#d91a1a}-6.66\%}$
test_values[vec_generalized_advantage_estimate-True-True] 36.3004ms 33.5522ms 29.8043 Ops/s 29.8189 Ops/s $\color{#d91a1a}-0.05\%$
test_values[td0_return_estimate-False-False] 0.2331ms 0.1865ms 5.3619 KOps/s 5.3377 KOps/s $\color{#35bf28}+0.45\%$
test_values[td1_return_estimate-False-False] 28.9849ms 24.9863ms 40.0219 Ops/s 40.7464 Ops/s $\color{#d91a1a}-1.78\%$
test_values[vec_td1_return_estimate-False-False] 35.4544ms 33.5728ms 29.7861 Ops/s 29.7581 Ops/s $\color{#35bf28}+0.09\%$
test_values[td_lambda_return_estimate-True-False] 46.5988ms 36.3629ms 27.5006 Ops/s 28.1376 Ops/s $\color{#d91a1a}-2.26\%$
test_values[vec_td_lambda_return_estimate-True-False] 36.6230ms 33.6264ms 29.7385 Ops/s 29.7081 Ops/s $\color{#35bf28}+0.10\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 10.0657ms 8.6290ms 115.8877 Ops/s 119.0806 Ops/s $\color{#d91a1a}-2.68\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2001ms 1.8719ms 534.2025 Ops/s 527.6354 Ops/s $\color{#35bf28}+1.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4235ms 0.3573ms 2.7985 KOps/s 2.7935 KOps/s $\color{#35bf28}+0.18\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 39.6970ms 36.3569ms 27.5051 Ops/s 27.1081 Ops/s $\color{#35bf28}+1.46\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.4060ms 3.0572ms 327.0975 Ops/s 328.3469 Ops/s $\color{#d91a1a}-0.38\%$
test_dqn_speed[False-None] 6.0943ms 1.3972ms 715.7184 Ops/s 710.3677 Ops/s $\color{#35bf28}+0.75\%$
test_dqn_speed[False-backward] 2.3149ms 1.8914ms 528.7171 Ops/s 532.2647 Ops/s $\color{#d91a1a}-0.67\%$
test_dqn_speed[True-None] 0.7456ms 0.4763ms 2.0995 KOps/s 2.0471 KOps/s $\color{#35bf28}+2.56\%$
test_dqn_speed[True-backward] 1.0477ms 0.9106ms 1.0981 KOps/s 1.0974 KOps/s $\color{#35bf28}+0.06\%$
test_dqn_speed[reduce-overhead-None] 0.6744ms 0.4828ms 2.0713 KOps/s 2.0510 KOps/s $\color{#35bf28}+0.99\%$
test_dqn_speed[reduce-overhead-backward] 0.9615ms 0.8984ms 1.1131 KOps/s 1.0689 KOps/s $\color{#35bf28}+4.13\%$
test_ddpg_speed[False-None] 3.2956ms 2.9158ms 342.9611 Ops/s 339.8468 Ops/s $\color{#35bf28}+0.92\%$
test_ddpg_speed[False-backward] 4.1581ms 4.0508ms 246.8634 Ops/s 243.3763 Ops/s $\color{#35bf28}+1.43\%$
test_ddpg_speed[True-None] 1.3910ms 1.0291ms 971.7549 Ops/s 971.6227 Ops/s $\color{#35bf28}+0.01\%$
test_ddpg_speed[True-backward] 2.4886ms 2.0175ms 495.6627 Ops/s 516.6502 Ops/s $\color{#d91a1a}-4.06\%$
test_ddpg_speed[reduce-overhead-None] 1.5001ms 1.0262ms 974.4334 Ops/s 968.8607 Ops/s $\color{#35bf28}+0.58\%$
test_ddpg_speed[reduce-overhead-backward] 2.0294ms 1.9308ms 517.9177 Ops/s 516.8650 Ops/s $\color{#35bf28}+0.20\%$
test_sac_speed[False-None] 10.2221ms 8.1433ms 122.7999 Ops/s 121.2533 Ops/s $\color{#35bf28}+1.28\%$
test_sac_speed[False-backward] 11.8661ms 10.9423ms 91.3885 Ops/s 89.3873 Ops/s $\color{#35bf28}+2.24\%$
test_sac_speed[True-None] 2.4033ms 1.8689ms 535.0627 Ops/s 539.8354 Ops/s $\color{#d91a1a}-0.88\%$
test_sac_speed[True-backward] 3.9571ms 3.5859ms 278.8699 Ops/s 283.0303 Ops/s $\color{#d91a1a}-1.47\%$
test_sac_speed[reduce-overhead-None] 2.8095ms 1.9004ms 526.2017 Ops/s 536.6772 Ops/s $\color{#d91a1a}-1.95\%$
test_sac_speed[reduce-overhead-backward] 3.8459ms 3.5844ms 278.9877 Ops/s 283.1425 Ops/s $\color{#d91a1a}-1.47\%$
test_redq_speed[False-None] 14.9544ms 13.2170ms 75.6600 Ops/s 76.3109 Ops/s $\color{#d91a1a}-0.85\%$
test_redq_speed[False-backward] 24.8576ms 22.6849ms 44.0823 Ops/s 44.5558 Ops/s $\color{#d91a1a}-1.06\%$
test_redq_speed[True-None] 6.0679ms 4.6436ms 215.3520 Ops/s 211.0058 Ops/s $\color{#35bf28}+2.06\%$
test_redq_speed[True-backward] 13.0474ms 12.2448ms 81.6676 Ops/s 80.9883 Ops/s $\color{#35bf28}+0.84\%$
test_redq_speed[reduce-overhead-None] 5.4179ms 4.6578ms 214.6926 Ops/s 207.4352 Ops/s $\color{#35bf28}+3.50\%$
test_redq_speed[reduce-overhead-backward] 13.3616ms 12.1756ms 82.1318 Ops/s 78.6890 Ops/s $\color{#35bf28}+4.38\%$
test_redq_deprec_speed[False-None] 15.9056ms 13.0869ms 76.4121 Ops/s 73.3199 Ops/s $\color{#35bf28}+4.22\%$
test_redq_deprec_speed[False-backward] 20.8796ms 19.0952ms 52.3692 Ops/s 51.2973 Ops/s $\color{#35bf28}+2.09\%$
test_redq_deprec_speed[True-None] 4.3152ms 3.6568ms 273.4665 Ops/s 272.5807 Ops/s $\color{#35bf28}+0.32\%$
test_redq_deprec_speed[True-backward] 8.7753ms 8.2599ms 121.0667 Ops/s 118.8640 Ops/s $\color{#35bf28}+1.85\%$
test_redq_deprec_speed[reduce-overhead-None] 4.2221ms 3.6382ms 274.8598 Ops/s 277.2618 Ops/s $\color{#d91a1a}-0.87\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.8404ms 8.3640ms 119.5604 Ops/s 123.3637 Ops/s $\color{#d91a1a}-3.08\%$
test_td3_speed[False-None] 8.5231ms 8.1808ms 122.2375 Ops/s 122.1992 Ops/s $\color{#35bf28}+0.03\%$
test_td3_speed[False-backward] 12.3875ms 10.7094ms 93.3759 Ops/s 94.3373 Ops/s $\color{#d91a1a}-1.02\%$
test_td3_speed[True-None] 1.8960ms 1.7628ms 567.2695 Ops/s 567.0432 Ops/s $\color{#35bf28}+0.04\%$
test_td3_speed[True-backward] 3.4811ms 3.3738ms 296.3998 Ops/s 293.2616 Ops/s $\color{#35bf28}+1.07\%$
test_td3_speed[reduce-overhead-None] 1.9804ms 1.7572ms 569.1026 Ops/s 567.5847 Ops/s $\color{#35bf28}+0.27\%$
test_td3_speed[reduce-overhead-backward] 4.1326ms 3.4143ms 292.8894 Ops/s 282.9396 Ops/s $\color{#35bf28}+3.52\%$
test_cql_speed[False-None] 40.1839ms 36.5406ms 27.3668 Ops/s 26.3466 Ops/s $\color{#35bf28}+3.87\%$
test_cql_speed[False-backward] 57.0550ms 47.3526ms 21.1182 Ops/s 20.9003 Ops/s $\color{#35bf28}+1.04\%$
test_cql_speed[True-None] 16.8292ms 15.8979ms 62.9013 Ops/s 60.0120 Ops/s $\color{#35bf28}+4.81\%$
test_cql_speed[True-backward] 23.8640ms 22.9629ms 43.5486 Ops/s 42.1887 Ops/s $\color{#35bf28}+3.22\%$
test_cql_speed[reduce-overhead-None] 16.6252ms 15.8922ms 62.9238 Ops/s 61.0403 Ops/s $\color{#35bf28}+3.09\%$
test_cql_speed[reduce-overhead-backward] 24.6388ms 22.2900ms 44.8632 Ops/s 43.0127 Ops/s $\color{#35bf28}+4.30\%$
test_a2c_speed[False-None] 9.1349ms 7.2870ms 137.2311 Ops/s 131.1032 Ops/s $\color{#35bf28}+4.67\%$
test_a2c_speed[False-backward] 14.8545ms 14.4767ms 69.0765 Ops/s 65.2283 Ops/s $\textbf{\color{#35bf28}+5.90\%}$
test_a2c_speed[True-None] 5.0949ms 4.2628ms 234.5900 Ops/s 234.1645 Ops/s $\color{#35bf28}+0.18\%$
test_a2c_speed[True-backward] 11.8477ms 10.8592ms 92.0878 Ops/s 85.1691 Ops/s $\textbf{\color{#35bf28}+8.12\%}$
test_a2c_speed[reduce-overhead-None] 4.9351ms 4.2483ms 235.3903 Ops/s 199.2059 Ops/s $\textbf{\color{#35bf28}+18.16\%}$
test_a2c_speed[reduce-overhead-backward] 11.3508ms 10.9842ms 91.0396 Ops/s 89.7051 Ops/s $\color{#35bf28}+1.49\%$
test_ppo_speed[False-None] 9.0267ms 7.4580ms 134.0836 Ops/s 127.2446 Ops/s $\textbf{\color{#35bf28}+5.37\%}$
test_ppo_speed[False-backward] 16.0064ms 15.0612ms 66.3956 Ops/s 64.0511 Ops/s $\color{#35bf28}+3.66\%$
test_ppo_speed[True-None] 4.0565ms 3.7421ms 267.2322 Ops/s 263.1493 Ops/s $\color{#35bf28}+1.55\%$
test_ppo_speed[True-backward] 11.2835ms 9.8608ms 101.4121 Ops/s 101.2188 Ops/s $\color{#35bf28}+0.19\%$
test_ppo_speed[reduce-overhead-None] 4.2895ms 3.7569ms 266.1770 Ops/s 263.7879 Ops/s $\color{#35bf28}+0.91\%$
test_ppo_speed[reduce-overhead-backward] 10.0106ms 9.6395ms 103.7402 Ops/s 102.6537 Ops/s $\color{#35bf28}+1.06\%$
test_reinforce_speed[False-None] 7.9483ms 6.5824ms 151.9191 Ops/s 149.1574 Ops/s $\color{#35bf28}+1.85\%$
test_reinforce_speed[False-backward] 11.8680ms 10.1052ms 98.9585 Ops/s 97.3903 Ops/s $\color{#35bf28}+1.61\%$
test_reinforce_speed[True-None] 3.0367ms 2.6711ms 374.3741 Ops/s 371.2138 Ops/s $\color{#35bf28}+0.85\%$
test_reinforce_speed[True-backward] 9.1023ms 8.8177ms 113.4085 Ops/s 112.7978 Ops/s $\color{#35bf28}+0.54\%$
test_reinforce_speed[reduce-overhead-None] 3.3090ms 2.7483ms 363.8616 Ops/s 366.0360 Ops/s $\color{#d91a1a}-0.59\%$
test_reinforce_speed[reduce-overhead-backward] 10.0637ms 9.4119ms 106.2488 Ops/s 113.3773 Ops/s $\textbf{\color{#d91a1a}-6.29\%}$
test_iql_speed[False-None] 39.1614ms 34.2194ms 29.2232 Ops/s 30.3789 Ops/s $\color{#d91a1a}-3.80\%$
test_iql_speed[False-backward] 49.4769ms 47.5662ms 21.0233 Ops/s 21.5255 Ops/s $\color{#d91a1a}-2.33\%$
test_iql_speed[True-None] 12.1786ms 11.2647ms 88.7732 Ops/s 91.1426 Ops/s $\color{#d91a1a}-2.60\%$
test_iql_speed[True-backward] 24.1505ms 22.8859ms 43.6951 Ops/s 45.4137 Ops/s $\color{#d91a1a}-3.78\%$
test_iql_speed[reduce-overhead-None] 12.6559ms 11.3384ms 88.1960 Ops/s 88.4141 Ops/s $\color{#d91a1a}-0.25\%$
test_iql_speed[reduce-overhead-backward] 23.8569ms 22.9082ms 43.6525 Ops/s 44.9645 Ops/s $\color{#d91a1a}-2.92\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9970ms 5.3576ms 186.6497 Ops/s 194.1193 Ops/s $\color{#d91a1a}-3.85\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8819ms 0.5461ms 1.8311 KOps/s 1.8361 KOps/s $\color{#d91a1a}-0.27\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8486ms 0.5263ms 1.9002 KOps/s 1.9577 KOps/s $\color{#d91a1a}-2.94\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.5655ms 5.0698ms 197.2450 Ops/s 210.1281 Ops/s $\textbf{\color{#d91a1a}-6.13\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.4184s 0.8676ms 1.1526 KOps/s 1.9626 KOps/s $\textbf{\color{#d91a1a}-41.27\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.8057ms 0.5008ms 1.9969 KOps/s 2.0204 KOps/s $\color{#d91a1a}-1.16\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.0541ms 1.6826ms 594.3316 Ops/s 573.2972 Ops/s $\color{#35bf28}+3.67\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.4952ms 1.5959ms 626.6237 Ops/s 627.0365 Ops/s $\color{#d91a1a}-0.07\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.4936ms 5.2740ms 189.6086 Ops/s 205.2508 Ops/s $\textbf{\color{#d91a1a}-7.62\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.1525ms 0.6840ms 1.4620 KOps/s 1.5119 KOps/s $\color{#d91a1a}-3.30\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9712ms 0.6424ms 1.5566 KOps/s 1.5759 KOps/s $\color{#d91a1a}-1.23\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6610ms 5.1900ms 192.6767 Ops/s 206.4196 Ops/s $\textbf{\color{#d91a1a}-6.66\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.7724ms 0.5535ms 1.8066 KOps/s 1.9003 KOps/s $\color{#d91a1a}-4.93\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.8448ms 0.5239ms 1.9087 KOps/s 1.9280 KOps/s $\color{#d91a1a}-1.00\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 8.0433ms 5.1684ms 193.4821 Ops/s 209.0696 Ops/s $\textbf{\color{#d91a1a}-7.46\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4214ms 0.5532ms 1.8077 KOps/s 1.9534 KOps/s $\textbf{\color{#d91a1a}-7.46\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 1.6718ms 0.5064ms 1.9746 KOps/s 2.0436 KOps/s $\color{#d91a1a}-3.38\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.9026ms 5.3293ms 187.6402 Ops/s 202.7079 Ops/s $\textbf{\color{#d91a1a}-7.43\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1578ms 0.6918ms 1.4455 KOps/s 1.4356 KOps/s $\color{#35bf28}+0.69\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0255ms 0.6519ms 1.5341 KOps/s 1.5560 KOps/s $\color{#d91a1a}-1.41\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5296s 15.3128ms 65.3049 Ops/s 36.7163 Ops/s $\textbf{\color{#35bf28}+77.86\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.9749ms 2.4533ms 407.6129 Ops/s 404.7089 Ops/s $\color{#35bf28}+0.72\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.6135ms 1.3935ms 717.6356 Ops/s 788.2662 Ops/s $\textbf{\color{#d91a1a}-8.96\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.8609ms 4.7184ms 211.9365 Ops/s 219.5810 Ops/s $\color{#d91a1a}-3.48\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.3886ms 2.4444ms 409.1014 Ops/s 424.1328 Ops/s $\color{#d91a1a}-3.54\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.9829ms 1.3334ms 749.9465 Ops/s 738.0165 Ops/s $\color{#35bf28}+1.62\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4983s 15.0160ms 66.5956 Ops/s 226.3047 Ops/s $\textbf{\color{#d91a1a}-70.57\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.6068ms 2.2738ms 439.7911 Ops/s 369.4658 Ops/s $\textbf{\color{#35bf28}+19.03\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2180ms 1.4279ms 700.3096 Ops/s 686.6167 Ops/s $\color{#35bf28}+1.99\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 14.0473ms 13.5907ms 73.5795 Ops/s 71.5916 Ops/s $\color{#35bf28}+2.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 22.0233ms 15.2340ms 65.6427 Ops/s 66.3955 Ops/s $\color{#d91a1a}-1.13\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 24.5741ms 22.5093ms 44.4262 Ops/s 44.0320 Ops/s $\color{#35bf28}+0.90\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.0035ms 15.4006ms 64.9323 Ops/s 65.0976 Ops/s $\color{#d91a1a}-0.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 23.4740ms 22.3094ms 44.8242 Ops/s 45.0423 Ops/s $\color{#d91a1a}-0.48\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.4002ms 16.6496ms 60.0616 Ops/s 60.2880 Ops/s $\color{#d91a1a}-0.38\%$
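
The Change column above appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD. The bolded rows appear to be those whose change is at least 5% in magnitude, which matches the summary counts of 6 improved and 11 worsened. The sketch below reproduces that arithmetic for three rows copied from the CPU table. The function names and the 5% threshold are assumptions inferred from the numbers, not taken from the benchmark tooling, and both values in a row must be in the same unit (convert KOps/s to Ops/s first).

```python
# Hypothetical helpers: names and the 5% threshold are assumptions
# inferred from the table, not the benchmark bot's actual code.
THRESHOLD_PCT = 5.0


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of the PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a benchmark as improved, worsened, or unchanged."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD), both in Ops/s, copied from the CPU table.
rows = {
    "test_simple": (2.3188, 2.2279),
    "test_values[generalized_advantage_estimate-True-True]": (96.6409, 103.5377),
    "test_a2c_speed[reduce-overhead-None]": (235.3903, 199.2059),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, most rows in the table fall inside the ±5% band and would not count toward either total.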

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}9$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_simple 0.7088s 0.7060s 1.4164 Ops/s 1.3649 Ops/s $\color{#35bf28}+3.78\%$
test_transformed 0.9638s 0.9592s 1.0425 Ops/s 1.0391 Ops/s $\color{#35bf28}+0.33\%$
test_serial 2.2034s 2.1141s 0.4730 Ops/s 0.4734 Ops/s $\color{#d91a1a}-0.09\%$
test_parallel 1.9463s 1.8383s 0.5440 Ops/s 0.5342 Ops/s $\color{#35bf28}+1.84\%$
test_step_mdp_speed[True-True-True-True-True] 0.1919ms 39.8571μs 25.0896 KOps/s 24.8487 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-True-True-True-False] 60.2910μs 23.5819μs 42.4054 KOps/s 42.4995 KOps/s $\color{#d91a1a}-0.22\%$
test_step_mdp_speed[True-True-True-False-True] 50.6010μs 22.1624μs 45.1215 KOps/s 44.5206 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[True-True-True-False-False] 50.2700μs 13.0543μs 76.6031 KOps/s 76.4202 KOps/s $\color{#35bf28}+0.24\%$
test_step_mdp_speed[True-True-False-True-True] 78.1510μs 42.0637μs 23.7735 KOps/s 23.2706 KOps/s $\color{#35bf28}+2.16\%$
test_step_mdp_speed[True-True-False-True-False] 54.0410μs 25.1318μs 39.7903 KOps/s 38.8964 KOps/s $\color{#35bf28}+2.30\%$
test_step_mdp_speed[True-True-False-False-True] 59.4400μs 24.1942μs 41.3322 KOps/s 40.2628 KOps/s $\color{#35bf28}+2.66\%$
test_step_mdp_speed[True-True-False-False-False] 41.6810μs 15.1719μs 65.9115 KOps/s 65.0289 KOps/s $\color{#35bf28}+1.36\%$
test_step_mdp_speed[True-False-True-True-True] 77.8420μs 44.3860μs 22.5296 KOps/s 21.9854 KOps/s $\color{#35bf28}+2.48\%$
test_step_mdp_speed[True-False-True-True-False] 55.8610μs 27.7801μs 35.9970 KOps/s 35.5091 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[True-False-True-False-True] 55.8710μs 24.0675μs 41.5498 KOps/s 40.3825 KOps/s $\color{#35bf28}+2.89\%$
test_step_mdp_speed[True-False-True-False-False] 40.7800μs 15.1054μs 66.2013 KOps/s 64.8541 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[True-False-False-True-True] 0.1285ms 46.5080μs 21.5017 KOps/s 20.9127 KOps/s $\color{#35bf28}+2.82\%$
test_step_mdp_speed[True-False-False-True-False] 58.8310μs 29.6820μs 33.6905 KOps/s 32.7317 KOps/s $\color{#35bf28}+2.93\%$
test_step_mdp_speed[True-False-False-False-True] 62.4010μs 26.7484μs 37.3855 KOps/s 36.5233 KOps/s $\color{#35bf28}+2.36\%$
test_step_mdp_speed[True-False-False-False-False] 52.4000μs 17.4180μs 57.4119 KOps/s 57.2166 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-True-True-True-True] 80.3010μs 44.2689μs 22.5892 KOps/s 22.2486 KOps/s $\color{#35bf28}+1.53\%$
test_step_mdp_speed[False-True-True-True-False] 61.8600μs 27.7993μs 35.9721 KOps/s 35.6948 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[False-True-True-False-True] 66.1700μs 28.1615μs 35.5095 KOps/s 34.9806 KOps/s $\color{#35bf28}+1.51\%$
test_step_mdp_speed[False-True-True-False-False] 42.9200μs 16.8537μs 59.3340 KOps/s 58.8773 KOps/s $\color{#35bf28}+0.78\%$
test_step_mdp_speed[False-True-False-True-True] 83.5610μs 46.4846μs 21.5125 KOps/s 20.8178 KOps/s $\color{#35bf28}+3.34\%$
test_step_mdp_speed[False-True-False-True-False] 77.6110μs 29.9207μs 33.4216 KOps/s 32.5334 KOps/s $\color{#35bf28}+2.73\%$
test_step_mdp_speed[False-True-False-False-True] 3.6419ms 31.2947μs 31.9543 KOps/s 31.8208 KOps/s $\color{#35bf28}+0.42\%$
test_step_mdp_speed[False-True-False-False-False] 57.7310μs 19.1614μs 52.1884 KOps/s 51.7640 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[False-False-True-True-True] 79.8910μs 49.5697μs 20.1736 KOps/s 20.2116 KOps/s $\color{#d91a1a}-0.19\%$
test_step_mdp_speed[False-False-True-True-False] 75.3310μs 32.3989μs 30.8653 KOps/s 30.4613 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[False-False-True-False-True] 64.9710μs 29.9138μs 33.4293 KOps/s 31.3742 KOps/s $\textbf{\color{#35bf28}+6.55\%}$
test_step_mdp_speed[False-False-True-False-False] 51.0800μs 19.0682μs 52.4433 KOps/s 51.6163 KOps/s $\color{#35bf28}+1.60\%$
test_step_mdp_speed[False-False-False-True-True] 81.9810μs 50.4020μs 19.8405 KOps/s 20.0068 KOps/s $\color{#d91a1a}-0.83\%$
test_step_mdp_speed[False-False-False-True-False] 65.3210μs 33.8200μs 29.5683 KOps/s 28.7804 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[False-False-False-False-True] 96.5010μs 31.7804μs 31.4659 KOps/s 30.4783 KOps/s $\color{#35bf28}+3.24\%$
test_step_mdp_speed[False-False-False-False-False] 0.1169ms 21.0721μs 47.4561 KOps/s 46.4281 KOps/s $\color{#35bf28}+2.21\%$
test_values[generalized_advantage_estimate-True-True] 24.8822ms 24.3751ms 41.0255 Ops/s 40.4673 Ops/s $\color{#35bf28}+1.38\%$
test_values[vec_generalized_advantage_estimate-True-True] 96.8727ms 2.8309ms 353.2448 Ops/s 344.4520 Ops/s $\color{#35bf28}+2.55\%$
test_values[td0_return_estimate-False-False] 0.1069ms 79.6082μs 12.5615 KOps/s 12.9200 KOps/s $\color{#d91a1a}-2.77\%$
test_values[td1_return_estimate-False-False] 55.1427ms 54.4542ms 18.3641 Ops/s 17.8350 Ops/s $\color{#35bf28}+2.97\%$
test_values[vec_td1_return_estimate-False-False] 1.3475ms 1.0764ms 929.0032 Ops/s 923.4282 Ops/s $\color{#35bf28}+0.60\%$
test_values[td_lambda_return_estimate-True-False] 86.4572ms 85.9055ms 11.6407 Ops/s 11.3053 Ops/s $\color{#35bf28}+2.97\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3969ms 1.0725ms 932.4008 Ops/s 931.7951 Ops/s $\color{#35bf28}+0.07\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 26.0129ms 25.4028ms 39.3657 Ops/s 40.8630 Ops/s $\color{#d91a1a}-3.66\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0590ms 0.7459ms 1.3406 KOps/s 1.3471 KOps/s $\color{#d91a1a}-0.48\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7505ms 0.6619ms 1.5109 KOps/s 1.4552 KOps/s $\color{#35bf28}+3.83\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5363ms 1.4724ms 679.1655 Ops/s 670.9833 Ops/s $\color{#35bf28}+1.22\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7390ms 0.6761ms 1.4792 KOps/s 1.4860 KOps/s $\color{#d91a1a}-0.46\%$
test_dqn_speed[False-None] 7.0965ms 1.5141ms 660.4401 Ops/s 664.8495 Ops/s $\color{#d91a1a}-0.66\%$
test_dqn_speed[False-backward] 2.1625ms 2.0784ms 481.1424 Ops/s 478.2507 Ops/s $\color{#35bf28}+0.60\%$
test_dqn_speed[True-None] 0.6506ms 0.5504ms 1.8170 KOps/s 1.7794 KOps/s $\color{#35bf28}+2.11\%$
test_dqn_speed[True-backward] 1.2760ms 1.2166ms 821.9571 Ops/s 803.4887 Ops/s $\color{#35bf28}+2.30\%$
test_dqn_speed[reduce-overhead-None] 0.6536ms 0.5704ms 1.7531 KOps/s 1.7344 KOps/s $\color{#35bf28}+1.08\%$
test_dqn_speed[reduce-overhead-backward] 1.1381ms 1.0861ms 920.7256 Ops/s 876.1105 Ops/s $\textbf{\color{#35bf28}+5.09\%}$
test_ddpg_speed[False-None] 3.2051ms 2.8308ms 353.2589 Ops/s 342.0884 Ops/s $\color{#35bf28}+3.27\%$
test_ddpg_speed[False-backward] 4.5183ms 4.1363ms 241.7634 Ops/s 239.3625 Ops/s $\color{#35bf28}+1.00\%$
test_ddpg_speed[True-None] 1.2280ms 1.0974ms 911.2862 Ops/s 899.4372 Ops/s $\color{#35bf28}+1.32\%$
test_ddpg_speed[True-backward] 2.3594ms 2.3094ms 433.0172 Ops/s 427.8760 Ops/s $\color{#35bf28}+1.20\%$
test_ddpg_speed[reduce-overhead-None] 1.2674ms 1.1132ms 898.2856 Ops/s 892.6277 Ops/s $\color{#35bf28}+0.63\%$
test_ddpg_speed[reduce-overhead-backward] 1.8501ms 1.7955ms 556.9477 Ops/s 548.8022 Ops/s $\color{#35bf28}+1.48\%$
test_sac_speed[False-None] 8.3717ms 7.9322ms 126.0692 Ops/s 125.6682 Ops/s $\color{#35bf28}+0.32\%$
test_sac_speed[False-backward] 11.4051ms 10.9767ms 91.1022 Ops/s 90.9646 Ops/s $\color{#35bf28}+0.15\%$
test_sac_speed[True-None] 1.6216ms 1.5577ms 641.9558 Ops/s 633.1868 Ops/s $\color{#35bf28}+1.38\%$
test_sac_speed[True-backward] 3.5320ms 3.4307ms 291.4886 Ops/s 306.8644 Ops/s $\textbf{\color{#d91a1a}-5.01\%}$
test_sac_speed[reduce-overhead-None] 23.2147ms 12.8725ms 77.6848 Ops/s 78.2160 Ops/s $\color{#d91a1a}-0.68\%$
test_sac_speed[reduce-overhead-backward] 1.4411ms 1.3522ms 739.5406 Ops/s 651.5400 Ops/s $\textbf{\color{#35bf28}+13.51\%}$
test_redq_speed[False-None] 8.1202ms 7.4018ms 135.1026 Ops/s 133.5008 Ops/s $\color{#35bf28}+1.20\%$
test_redq_speed[False-backward] 11.7150ms 11.0230ms 90.7197 Ops/s 86.6766 Ops/s $\color{#35bf28}+4.66\%$
test_redq_speed[True-None] 2.1348ms 2.0110ms 497.2712 Ops/s 479.0480 Ops/s $\color{#35bf28}+3.80\%$
test_redq_speed[True-backward] 4.2740ms 3.8461ms 260.0052 Ops/s 272.1151 Ops/s $\color{#d91a1a}-4.45\%$
test_redq_speed[reduce-overhead-None] 2.0604ms 2.0017ms 499.5804 Ops/s 492.9620 Ops/s $\color{#35bf28}+1.34\%$
test_redq_speed[reduce-overhead-backward] 3.9056ms 3.8289ms 261.1709 Ops/s 258.4682 Ops/s $\color{#35bf28}+1.05\%$
test_redq_deprec_speed[False-None] 9.5021ms 8.9187ms 112.1245 Ops/s 109.3068 Ops/s $\color{#35bf28}+2.58\%$
test_redq_deprec_speed[False-backward] 12.7665ms 12.0320ms 83.1119 Ops/s 81.5238 Ops/s $\color{#35bf28}+1.95\%$
test_redq_deprec_speed[True-None] 2.5135ms 2.4284ms 411.7871 Ops/s 421.8817 Ops/s $\color{#d91a1a}-2.39\%$
test_redq_deprec_speed[True-backward] 4.5624ms 4.2326ms 236.2616 Ops/s 237.9293 Ops/s $\color{#d91a1a}-0.70\%$
test_redq_deprec_speed[reduce-overhead-None] 2.4264ms 2.3301ms 429.1607 Ops/s 427.0606 Ops/s $\color{#35bf28}+0.49\%$
test_redq_deprec_speed[reduce-overhead-backward] 4.5084ms 4.0366ms 247.7309 Ops/s 238.2479 Ops/s $\color{#35bf28}+3.98\%$
test_td3_speed[False-None] 8.0253ms 7.8208ms 127.8643 Ops/s 127.6804 Ops/s $\color{#35bf28}+0.14\%$
test_td3_speed[False-backward] 10.5720ms 10.0367ms 99.6342 Ops/s 97.8376 Ops/s $\color{#35bf28}+1.84\%$
test_td3_speed[True-None] 1.6216ms 1.5981ms 625.7389 Ops/s 619.7081 Ops/s $\color{#35bf28}+0.97\%$
test_td3_speed[True-backward] 3.4180ms 3.3449ms 298.9585 Ops/s 299.4564 Ops/s $\color{#d91a1a}-0.17\%$
test_td3_speed[reduce-overhead-None] 84.0867ms 27.2428ms 36.7070 Ops/s 35.6053 Ops/s $\color{#35bf28}+3.09\%$
test_td3_speed[reduce-overhead-backward] 1.5469ms 1.4833ms 674.1647 Ops/s 664.6626 Ops/s $\color{#35bf28}+1.43\%$
test_cql_speed[False-None] 17.1648ms 16.5516ms 60.4171 Ops/s 59.4962 Ops/s $\color{#35bf28}+1.55\%$
test_cql_speed[False-backward] 22.0311ms 21.3993ms 46.7305 Ops/s 45.4923 Ops/s $\color{#35bf28}+2.72\%$
test_cql_speed[True-None] 3.1426ms 2.9683ms 336.8952 Ops/s 334.1977 Ops/s $\color{#35bf28}+0.81\%$
test_cql_speed[True-backward] 5.5815ms 5.1454ms 194.3487 Ops/s 186.3896 Ops/s $\color{#35bf28}+4.27\%$
test_cql_speed[reduce-overhead-None] 22.8149ms 13.6578ms 73.2180 Ops/s 74.9106 Ops/s $\color{#d91a1a}-2.26\%$
test_cql_speed[reduce-overhead-backward] 1.6642ms 1.5230ms 656.6023 Ops/s 649.0915 Ops/s $\color{#35bf28}+1.16\%$
test_a2c_speed[False-None] 3.4587ms 3.1641ms 316.0417 Ops/s 311.8852 Ops/s $\color{#35bf28}+1.33\%$
test_a2c_speed[False-backward] 6.4997ms 5.9307ms 168.6149 Ops/s 165.1849 Ops/s $\color{#35bf28}+2.08\%$
test_a2c_speed[True-None] 1.1152ms 1.0185ms 981.8704 Ops/s 967.9701 Ops/s $\color{#35bf28}+1.44\%$
test_a2c_speed[True-backward] 2.7097ms 2.6197ms 381.7203 Ops/s 377.1701 Ops/s $\color{#35bf28}+1.21\%$
test_a2c_speed[reduce-overhead-None] 22.4738ms 12.0980ms 82.6583 Ops/s 86.0726 Ops/s $\color{#d91a1a}-3.97\%$
test_a2c_speed[reduce-overhead-backward] 1.0617ms 0.9799ms 1.0206 KOps/s 859.7777 Ops/s $\textbf{\color{#35bf28}+18.70\%}$
test_ppo_speed[False-None] 3.7482ms 3.6362ms 275.0112 Ops/s 273.5042 Ops/s $\color{#35bf28}+0.55\%$
test_ppo_speed[False-backward] 7.0522ms 6.6485ms 150.4093 Ops/s 145.6448 Ops/s $\color{#35bf28}+3.27\%$
test_ppo_speed[True-None] 1.0106ms 0.9650ms 1.0363 KOps/s 1.0332 KOps/s $\color{#35bf28}+0.29\%$
test_ppo_speed[True-backward] 2.6285ms 2.5659ms 389.7332 Ops/s 387.7890 Ops/s $\color{#35bf28}+0.50\%$
test_ppo_speed[reduce-overhead-None] 0.6279ms 0.5288ms 1.8912 KOps/s 1.8348 KOps/s $\color{#35bf28}+3.08\%$
test_ppo_speed[reduce-overhead-backward] 1.0325ms 0.9684ms 1.0327 KOps/s 989.3091 Ops/s $\color{#35bf28}+4.38\%$
test_reinforce_speed[False-None] 2.3717ms 2.2396ms 446.5092 Ops/s 440.0478 Ops/s $\color{#35bf28}+1.47\%$
test_reinforce_speed[False-backward] 3.6756ms 3.2253ms 310.0527 Ops/s 310.5872 Ops/s $\color{#d91a1a}-0.17\%$
test_reinforce_speed[True-None] 0.9217ms 0.8404ms 1.1898 KOps/s 1.1496 KOps/s $\color{#35bf28}+3.50\%$
test_reinforce_speed[True-backward] 2.5014ms 2.4293ms 411.6417 Ops/s 407.5008 Ops/s $\color{#35bf28}+1.02\%$
test_reinforce_speed[reduce-overhead-None] 23.0122ms 12.0702ms 82.8483 Ops/s 86.0859 Ops/s $\color{#d91a1a}-3.76\%$
test_reinforce_speed[reduce-overhead-backward] 1.0819ms 1.0424ms 959.3508 Ops/s 927.0089 Ops/s $\color{#35bf28}+3.49\%$
test_iql_speed[False-None] 9.6948ms 9.1795ms 108.9387 Ops/s 109.5421 Ops/s $\color{#d91a1a}-0.55\%$
test_iql_speed[False-backward] 13.3847ms 12.6761ms 78.8886 Ops/s 79.0335 Ops/s $\color{#d91a1a}-0.18\%$
test_iql_speed[True-None] 2.0124ms 1.8083ms 553.0168 Ops/s 561.0602 Ops/s $\color{#d91a1a}-1.43\%$
test_iql_speed[True-backward] 4.3279ms 4.2783ms 233.7401 Ops/s 221.6905 Ops/s $\textbf{\color{#35bf28}+5.44\%}$
test_iql_speed[reduce-overhead-None] 20.6045ms 11.7677ms 84.9786 Ops/s 85.3411 Ops/s $\color{#d91a1a}-0.42\%$
test_iql_speed[reduce-overhead-backward] 1.4667ms 1.4292ms 699.6956 Ops/s 612.2058 Ops/s $\textbf{\color{#35bf28}+14.29\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.9726ms 6.4419ms 155.2342 Ops/s 150.9572 Ops/s $\color{#35bf28}+2.83\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.5246ms 0.3005ms 3.3281 KOps/s 2.7507 KOps/s $\textbf{\color{#35bf28}+20.99\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5285ms 0.2872ms 3.4817 KOps/s 2.8463 KOps/s $\textbf{\color{#35bf28}+22.33\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3933ms 6.1559ms 162.4445 Ops/s 158.2708 Ops/s $\color{#35bf28}+2.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3601ms 0.3341ms 2.9928 KOps/s 3.1824 KOps/s $\textbf{\color{#d91a1a}-5.96\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5972ms 0.3370ms 2.9674 KOps/s 3.5128 KOps/s $\textbf{\color{#d91a1a}-15.53\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5997ms 1.4002ms 714.2013 Ops/s 765.3752 Ops/s $\textbf{\color{#d91a1a}-6.69\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4425ms 1.2531ms 798.0252 Ops/s 863.9517 Ops/s $\textbf{\color{#d91a1a}-7.63\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.4892ms 6.3564ms 157.3225 Ops/s 156.4872 Ops/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.7356ms 0.4123ms 2.4255 KOps/s 2.2082 KOps/s $\textbf{\color{#35bf28}+9.84\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6254ms 0.4048ms 2.4703 KOps/s 2.2527 KOps/s $\textbf{\color{#35bf28}+9.66\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4019ms 6.2282ms 160.5600 Ops/s 159.1908 Ops/s $\color{#35bf28}+0.86\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.7620ms 0.3496ms 2.8604 KOps/s 2.9639 KOps/s $\color{#d91a1a}-3.49\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5806ms 0.3392ms 2.9480 KOps/s 3.1650 KOps/s $\textbf{\color{#d91a1a}-6.86\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4132ms 6.1917ms 161.5076 Ops/s 158.4975 Ops/s $\color{#35bf28}+1.90\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.5414ms 0.2672ms 3.7426 KOps/s 3.1482 KOps/s $\textbf{\color{#35bf28}+18.88\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4660ms 0.2484ms 4.0251 KOps/s 3.3620 KOps/s $\textbf{\color{#35bf28}+19.72\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.6359ms 6.3413ms 157.6952 Ops/s 154.8227 Ops/s $\color{#35bf28}+1.86\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1829ms 0.4882ms 2.0483 KOps/s 2.2478 KOps/s $\textbf{\color{#d91a1a}-8.87\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7296ms 0.4665ms 2.1438 KOps/s 2.3045 KOps/s $\textbf{\color{#d91a1a}-6.97\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 7.0929ms 5.4153ms 184.6608 Ops/s 184.3338 Ops/s $\color{#35bf28}+0.18\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 8.9926ms 1.9546ms 511.6086 Ops/s 429.6613 Ops/s $\textbf{\color{#35bf28}+19.07\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.9864ms 1.2180ms 821.0299 Ops/s 836.5405 Ops/s $\color{#d91a1a}-1.85\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.4784ms 5.5216ms 181.1060 Ops/s 186.7102 Ops/s $\color{#d91a1a}-3.00\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.4227ms 2.0254ms 493.7300 Ops/s 463.8271 Ops/s $\textbf{\color{#35bf28}+6.45\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.1320ms 1.2438ms 804.0075 Ops/s 783.2623 Ops/s $\color{#35bf28}+2.65\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5279s 16.1140ms 62.0578 Ops/s 31.9097 Ops/s $\textbf{\color{#35bf28}+94.48\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 11.3725ms 2.2492ms 444.6099 Ops/s 462.8453 Ops/s $\color{#d91a1a}-3.94\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.3426ms 1.3332ms 750.0957 Ops/s 861.1237 Ops/s $\textbf{\color{#d91a1a}-12.89\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 15.4064ms 15.1710ms 65.9154 Ops/s 62.1306 Ops/s $\textbf{\color{#35bf28}+6.09\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9570ms 17.4035ms 57.4596 Ops/s 55.7225 Ops/s $\color{#35bf28}+3.12\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 20.6374ms 20.0221ms 49.9448 Ops/s 48.0335 Ops/s $\color{#35bf28}+3.98\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.8188ms 17.6746ms 56.5783 Ops/s 55.6388 Ops/s $\color{#35bf28}+1.69\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.4768ms 19.8139ms 50.4695 Ops/s 48.3458 Ops/s $\color{#35bf28}+4.39\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.2680ms 19.3795ms 51.6008 Ops/s 51.5675 Ops/s $\color{#35bf28}+0.06\%$
