Refactor #347

Open
wants to merge 33 commits into
base: delete_marlgrid

Conversation

dapatil211
Collaborator

No description provided.

dapatil211 and others added 30 commits February 28, 2023 17:30
* Add an evaluation before training begins
* Adding initial PPO Code

* Added buffer sampling and solved some bugs

* ppo agent: device and type errors fixed

* ppo updater fixed

* ppo config updated

* Updated Hive to use gym spaces instead of raw tuples to represent action and observation spaces

* Updated tests to reflect API change of 1106ec2

* Adding initial PPO Code

* Added buffer sampling and solved some bugs

* ppo agent: device and type errors fixed

* ppo updater fixed

* ppo config updated

* ppo replay added

* ppo replay fixed

* ppo agent updated

* ppo agent and config updated

* ppo code running but buggy

* cartpole working

* ppo configs

* ppo net fixed

* atari configs added

* ppo_nets done

* ppo_replay done

* ppo env wrappers added

* ppo agent done

* configs done

* stack size > 1 handled temporarily

* linting fixed

* last batch drop fix

* config changes

* shared network added

* reward wrapper added

* linting fixed

* docs fixed

* replay changed

* update loop

* type specification

* env wrappers registered

* linting fixed

* Removed one off transition, cleaned up replay buffer

* Fixed linter issues

* wrapper error fixed

* added vars to dict; fixed long lines and var names; moved wrapper registry

* config fixed

* added normalisation and fixed log

* norm filed added

* norm bug fixed

* rew norm updated

* fixes

* fixing norm bug; config

* config fixes

* obs norm

* hardcoded wrappers added

* normaliser shape fixed

* rew shape fixed; norm structure updated

* rew norm

* configs and wrapper fixed

* Fixed formatting and naming

* Added env wrapper logic

* Renamed PPO Replay Buffer to On Policy Replay buffer

* Made PPO Stateless Agent

* Fixed linting issues

* Minor modifications

* Fixed changed

* Formatting and minor changes

* Refactored Advantage Computation

* Reformatting with black

* Renaming

* Refactored Normalization code

* Added saving and loading of state dict for normalizers

* Fixed multiplayer replay buffer for PPO

* Fixed minor bug

* Renamed file

* Added lr annealing

---------

Co-authored-by: Sriyash <[email protected]>
Co-authored-by: sriyash.poddar <[email protected]>
Co-authored-by: Darshan Patil <[email protected]>
Co-authored-by: sriyash.poddar <[email protected]>
Co-authored-by: sriyash.poddar <[email protected]>
Co-authored-by: sriyash.poddar <[email protected]>
Co-authored-by: sriyash.poddar <[email protected]>
Co-authored-by: Sriyash Poddar <[email protected]>
* Fixed issues with moving from done to terminated, truncated

* Undo change to logging' scales

* Revert changes to this file, SAC agents don't exist yet

* Clean up test file
* Fixed issues with moving from done to terminated, truncated

* Undo change to logging' scales

* Revert changes to this file, SAC agents don't exist yet

* Clean up test file

* Added SAC agents

* Minor code cleanup
* first version. testing.

* pylint

* pylint

* pylint

* pylint

* pylint

* test fix

* pylint

* pylint

* resolve discussion

---------

Co-authored-by: artem.zholus <[email protected]>
Co-authored-by: Darshan Patil <[email protected]>
* Added option to initialize separate components differently

* Made minor fixes

* Fixed init and registration

* Fix term trunc (#336) (#341)

* Fixed issues with moving from done to terminated, truncated

* Undo change to logging' scales

* Revert changes to this file, SAC agents don't exist yet

* Clean up test file

Co-authored-by: Darshan Patil <[email protected]>

---------

Co-authored-by: Darshan Patil <[email protected]>
* restore hidden states added

* burn in frames feature added

* a more general recurrent agent

* stateless drqn agent

* Remove AtariEnv (#334)

* first version. testing.

* pylint

* pylint

* pylint

* pylint

* pylint

* test fix

* pylint

* pylint

* resolve discussion

---------

Co-authored-by: artem.zholus <[email protected]>
Co-authored-by: Darshan Patil <[email protected]>

* Update pull_request_ci.yml (#342)

* Add initialisation (#340)

* Added option to initialize separate components differently

* Made minor fixes

* Fixed init and registration

* Fix term trunc (#336) (#341)

* Fixed issues with moving from done to terminated, truncated

* Undo change to logging' scales

* Revert changes to this file, SAC agents don't exist yet

* Clean up test file

Co-authored-by: Darshan Patil <[email protected]>

---------

Co-authored-by: Darshan Patil <[email protected]>

* new sequence model version

* restore hidden states added

* burn in frames feature added

* a more general recurrent agent

* stateless drqn agent

* new sequence model version

* Update Sequence model to remove coupling with LSTM/GRU

* Fix issue with agent_traj_state vs hidden_state

* Update config to match new atari env

* Fixed issue with storing hidden states in replay buffer

---------

Co-authored-by: Artem Zholus <[email protected]>
Co-authored-by: artem.zholus <[email protected]>
Co-authored-by: Darshan Patil <[email protected]>
Co-authored-by: Kshitij Gupta <[email protected]>
* restore hidden states added

* burn in frames feature added

* a more general recurrent agent

* stateless drqn agent

* Remove padding in sampling, Add masking in loss

* Reformatting with black

* Remove the commented line

* new sequence model version

* restore hidden states added

* burn in frames feature added

* a more general recurrent agent

* stateless drqn agent

* new sequence model version

* Update Sequence model to remove coupling with LSTM/GRU

* Fix issue with agent_traj_state vs hidden_state

* Update config to match new atari env

* Fixed issue with storing hidden states in replay buffer

---------

Co-authored-by: Your Name <[email protected]>
Co-authored-by: Darshan Patil <[email protected]>
* Add modified files

* Add more cleaning

* Black Linting applied

* Fix bugs

* Delete Atari qnet

* Remove NatureAtariDQNModel from repo

---------

Co-authored-by: Darshan Patil <[email protected]>

7 participants