Stateless schedules #345
Open
dapatil211 wants to merge 18 commits into dev from stateless_schedules
* Add an evaluation before training begins
* Adding initial PPO Code * Added buffer sampling and solved some bugs * ppo agent: device and type errors fixed * ppo updater fixed * ppo config updated * Updated Hive to use gym spaces instead of raw tuples to represent action and observation spaces * Updated tests to affect api change of 1106ec2 * Adding initial PPO Code * Added buffer sampling and solved some bugs * ppo agent: device and type errors fixed * ppo updater fixed * ppo config updated * ppo replay added * ppo replay fixed * ppo agent updated * ppo agent and config updated * ppo code running but buggy * cartpole working * ppo configs * ppo net fixed * atari configs added * ppo_nets done * ppo_replay done * ppo env wrappers added * ppo agent done * configs done * stack size > 1 handled temporarily * linting fixed * last batch drop fix * config changes * shared network added * reward wrapper added * linting fixed * docs fixed * replay changed * update loop * type specification * env wrappers registered * linting fixed * Removed one off transition, cleaned up replay buffer * Fixed linter issues * wrapper error fixed * added vars to dict; fixed long lines and var names; moved wrapper registry * config fixed * added normalisation and fixed log * norm filed added * norm bug fixed * rew norm updated * fixes * fixing norm bug; config * config fixes * obs norm * hardcoded wrappers added * normaliser shape fixed * rew shape fixed; norm structure updated * rew norm * configs and wrapper fixed * Fixed formatting and naming * Added env wrapper logic * Renamed PPO Replay Buffer to On Policy Replay buffer * Made PPO Stateless Agent * Fixed linting issues * Minor modifications * Fixed changed * Formatting and minor changes * Refactored Advantage Computation * Reformatting with black * Renaming * Refactored Normalization code * Added saving and loading of state dict for normalizers * Fixed multiplayer replay buffer for PPO * Fixed minor bug * Renamed file * Added lr annealing --------- Co-authored-by: Sriyash <[email protected]> Co-authored-by: sriyash.poddar <[email protected]> Co-authored-by: Darshan Patil <[email protected]> Co-authored-by: sriyash.poddar <[email protected]> Co-authored-by: sriyash.poddar <[email protected]> Co-authored-by: sriyash.poddar <[email protected]> Co-authored-by: sriyash.poddar <[email protected]> Co-authored-by: Sriyash Poddar <[email protected]>
* Fixed issues with moving from done to terminated, truncated * Undo change to logging' scales * Revert changes to this file, SAC agents don't exist yet * Clean up test file
* Fixed issues with moving from done to terminated, truncated * Undo change to logging' scales * Revert changes to this file, SAC agents don't exist yet * Clean up test file * Added SAC agents * Minor code cleanup
* first version. testing. * pylint * pylint * pylint * pylint * pylint * test fix * pylint * pylint * resolve discussion --------- Co-authored-by: artem.zholus <[email protected]> Co-authored-by: Darshan Patil <[email protected]>
* Added option to initialize separate components differently * Made minor fixes * Fixed init and registration * Fix term trunc (#336) (#341) * Fixed issues with moving from done to terminated, truncated * Undo change to logging' scales * Revert changes to this file, SAC agents don't exist yet * Clean up test file Co-authored-by: Darshan Patil <[email protected]> --------- Co-authored-by: Darshan Patil <[email protected]>
* restore hidden states added * burn in frames feature added * a more general recurrent agent * stateless drqn agent * Remove AtariEnv (#334) * first version. testing. * pylint * pylint * pylint * pylint * pylint * test fix * pylint * pylint * resolve discussion --------- Co-authored-by: artem.zholus <[email protected]> Co-authored-by: Darshan Patil <[email protected]> * Update pull_request_ci.yml (#342) * Add initialisation (#340) * Added option to initialize separate components differently * Made minor fixes * Fixed init and registration * Fix term trunc (#336) (#341) * Fixed issues with moving from done to terminated, truncated * Undo change to logging' scales * Revert changes to this file, SAC agents don't exist yet * Clean up test file Co-authored-by: Darshan Patil <[email protected]> --------- Co-authored-by: Darshan Patil <[email protected]> * new sequence model version * restore hidden states added * burn in frames feature added * a more general recurrent agent * stateless drqn agent * new sequence model version * Update Sequence model to remove coupling with LSTM/GRU * Fix issue with agent_traj_state vs hidden_state * Update config to match new atari env * Fixed issue with storing hidden states in replay buffer --------- Co-authored-by: Artem Zholus <[email protected]> Co-authored-by: artem.zholus <[email protected]> Co-authored-by: Darshan Patil <[email protected]> Co-authored-by: Kshitij Gupta <[email protected]>
* restore hidden states added * burn in frames feature added * a more general recurrent agent * stateless drqn agent * Remove padding in sampling, Add masking in loss * Reformatting with black * Remove the commented line * new sequence model version * restore hidden states added * burn in frames feature added * a more general recurrent agent * stateless drqn agent * new sequence model version * Update Sequence model to remove coupling with LSTM/GRU * Fix issue with agent_traj_state vs hidden_state * Update config to match new atari env * Fixed issue with storing hidden states in replay buffer --------- Co-authored-by: Your Name <[email protected]> Co-authored-by: Darshan Patil <[email protected]>
* Add modified files * Add more cleaning * Black Linting applied * Fix bugs * Delete Atari qnet * Remove NatureAtariDQNModel from repo --------- Co-authored-by: Darshan Patil <[email protected]>
No description provided.