OC-CleanRL

This fork enables the use of the OCAtari and HackAtari wrappers for Gymnasium instead of pure Gymnasium. OCAtari and HackAtari provide advanced wrappers that extract and expose object-centered representations, yielding more interpretable observations and potentially improving training efficiency compared to raw pixel-based inputs. The goal is to train agents on object-centered input representations instead of purely pixel-based ones. Our experiments currently focus on Atari environments, in particular games such as Pong, Breakout, and Space Invaders, to evaluate the effectiveness of object-centered representations.

ℹ️ Support for Gymnasium and v5
Farama-Foundation/Gymnasium is the next generation of openai/gym that will continue to be maintained and introduce new features. Please see their announcement for further details. CleanRL has already migrated to gymnasium (see vwxyzjn/cleanrl#277). Primarily, we base our training on the new v5 versions of the Atari games; see ALE-Farama.


About OCAtari

You can find the OCAtari repository at OCAtari GitHub.

OCAtari is a specialized wrapper designed for Atari environments that transforms pixel-based observations into object-centered representations. This allows agents to interact with the environment using more interpretable and structured inputs. By extracting meaningful object-level features, OCAtari enhances the efficiency and robustness of reinforcement learning models, especially in tasks where pixel-level noise can hinder performance.
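
As a rough illustration of what an object-centered observation looks like, the following minimal sketch queries OCAtari directly, outside of CleanRL's training loop. The class OCAtari and its objects attribute come from the OCAtari project; the constructor arguments shown here (mode, render_mode) are assumptions and may differ between OCAtari versions.

# Minimal sketch: inspect OCAtari's object-centered representation directly.
# Constructor arguments are assumptions and may differ between OCAtari versions.
from ocatari.core import OCAtari

env = OCAtari("ALE/Pong-v5", mode="ram", render_mode="rgb_array")
obs, info = env.reset()

for _ in range(10):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    # env.objects holds the extracted game objects (e.g. Player, Ball, Enemy)
    # with properties such as position and size for the current frame.
    print(env.objects)
    if terminated or truncated:
        obs, info = env.reset()

env.close()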


CleanRL (Clean Implementation of RL Algorithms) - by Costa Huang

CleanRL is a Deep Reinforcement Learning library that provides high-quality, single-file implementations with research-friendly features. The implementation is clean and simple yet scalable to run thousands of experiments using AWS Batch. Key features of CleanRL include:

  • 📜 Single-file implementation

    • All details about an algorithm variant are in a single, standalone file.
    • For example, ppo_atari.py has only 340 lines of code but contains all implementation details on how PPO works with Atari games. This makes it a great reference implementation for those who do not want to read an entire modular library.
  • 📊 Benchmarked Implementations

  • 📈 TensorBoard Logging

  • 🪛 Local Reproducibility via Seeding

  • 🎮 Gameplay Video Capturing

  • 🧫 Experiment Management with Weights and Biases

  • 💸 Cloud Integration

    • Docker and AWS support for seamless scaling.

We keep this fork up to date with the original CleanRL master branch to enable further adaptations of algorithms for object-centered representations. For more details, refer to the original CleanRL repository and its documentation.

⚠️ Note
CleanRL is not a modular library. This means it is not meant to be imported as a library. Instead, it is designed to make all implementation details of a DRL algorithm variant easy to understand, at the cost of duplicate code. Use CleanRL if you want to:

  1. Understand all implementation details of an algorithm variant.
  2. Prototype advanced features that modular DRL libraries may not support. Keeping each implementation to a minimal number of lines makes debugging easier and avoids the extensive subclassing that modular libraries require.

Getting Started with OC-CleanRL

Prerequisites

  • Python >= 3.9, < 3.13
  • pip

Running Experiments Locally

  1. Clone the repository

    git clone git@github.com:BluemlJ/oc_cleanrl.git --recursive && cd oc_cleanrl
  2. Install dependencies

    # Core dependencies
    pip install -r requirements/requirements.txt
    
    # Atari-specific dependencies
    pip install -r requirements/requirements-atari.txt
  3. Enable OCAtari/HackAtari

    cd submodules/OC_Atari
    pip install -e .
  4. Start a training run

    python cleanrl/ppo_atari_oc.py --env-id ALE/Pong-v5 --obs_mode obj --architecture PPO_OBJ --backend OCAtari
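
Note: the OCAtari/HackAtari code used in step 3 lives in the submodules/OC_Atari git submodule, which is fetched by the --recursive flag in step 1. If you cloned without that flag, you can fetch the submodule afterwards with the standard git command:

    git submodule update --init --recursive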

Tracking Results with W&B

You can track the results of training runs using Weights & Biases (W&B). W&B lets you visualize key metrics, compare runs across different experiments, and easily share results with collaborators. For instance, you can monitor training progress, analyze model performance, and debug issues more effectively through W&B's interactive dashboards.

python cleanrl/ppo_atari_oc.py \
  --env-id ALE/${game_name}-v5 \
  --backend OCAtari \
  --obs_mode obj \
  --architecture PPO_OBJ \
  --track \
  --capture_video \
  --wandb-project-name OCAtari \
  --exp-name "obj_based_ppo"

Additional W&B settings can be adjusted directly in the training scripts.
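
For orientation, CleanRL-style scripts keep this configuration in a plain dataclass that is parsed from the command line. The sketch below lists the W&B-related fields as they appear in upstream CleanRL's ppo_atari.py; ppo_atari_oc.py may name or default them slightly differently, and the defaults shown here are illustrative values taken from the example command above.

# Sketch of the W&B-related experiment configuration in CleanRL-style scripts.
# Field names follow upstream cleanrl/ppo_atari.py; this fork's scripts may differ.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Args:
    exp_name: str = "obj_based_ppo"       # run name reported to W&B
    track: bool = False                    # if True, metrics are logged to W&B
    wandb_project_name: str = "OCAtari"    # W&B project to log into
    wandb_entity: Optional[str] = None     # W&B team/username (None = your default entity)
    capture_video: bool = False            # record gameplay videos of the agent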


Next Steps and Contributing

If you have any questions or need support, feel free to reach out by creating an issue on the GitHub repository.

Next Steps

  • Experiment with different Atari environments to explore object-centered representations further.
  • Compare the performance of pixel-based and object-centered models across various tasks.
  • Enable envpool and more methods to use object-centered inputs.

Contributing

We welcome contributions to OC-CleanRL! If you'd like to contribute:

  1. Fork the repository and create a new branch for your feature or bugfix.
  2. Follow the existing coding style and add relevant tests.
  3. Submit a pull request and include a detailed description of your changes.
