Safe-Worlds

Introduction

I present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents. These problems include safe interruptibility, avoiding side effects, reward gaming. To measure compliance with the intended safe behavior, we equip each environment with a reward function that is hidden from the agent. This is another AI project using Grid Worlds but it was developed from the @DeepMind team. My work is a simpler version of them, required for the AI module in the University.

The second doc "GridWorld-paper" has the paper published by the DeepMind team.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
Avoiding_side_effects_world.py		Avoiding_side_effects_world.py
GridWorld - Paper.pdf		GridWorld - Paper.pdf
LICENSE		LICENSE
README.md		README.md
Reward_gaming_world.py		Reward_gaming_world.py
Safe_interruptibility_world.py		Safe_interruptibility_world.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Safe-Worlds

About

Releases

Packages

Languages

License

1Klevi1/Safe-Worlds

Folders and files

Latest commit

History

Repository files navigation

Safe-Worlds

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages