Q-Learning Applied to the Cart Pole Environment

In this repository, I've implemented both deep Q-learning (DQN) and tabular Q-learning on the Cart Pole environment.
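
As a rough illustration of the DQN side, here is a minimal sketch of a small Q-network for Cart Pole; the use of PyTorch, the layer sizes, and the hidden width are assumptions for illustration and may not match the code in this repository.

```python
import torch.nn as nn

# Minimal Q-network sketch for Cart Pole (4 observations, 2 actions).
# Layer sizes are illustrative, not necessarily those used in this repo.
class DQN(nn.Module):
    def __init__(self, n_observations=4, n_actions=2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_observations, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_actions),  # one Q-value per action
        )

    def forward(self, x):
        return self.net(x)
```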

For the tabular version of Q-learning, the state space of this problem is continuous, so I discretize the state values into buckets, which makes a lookup table feasible.
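
As an illustration of the bucketing, below is a minimal sketch of the discretization step; the bucket counts and state bounds are placeholder values and may differ from those used in this repository.

```python
import numpy as np

# Hypothetical bucket counts and bounds for Cart Pole's
# (position, velocity, angle, angular velocity) observation.
BUCKETS = (1, 1, 6, 12)
STATE_LOW = np.array([-4.8, -3.0, -0.418, -3.5])
STATE_HIGH = np.array([4.8, 3.0, 0.418, 3.5])

def discretize(state):
    """Map a continuous Cart Pole state to a tuple of bucket indices."""
    ratios = (np.asarray(state) - STATE_LOW) / (STATE_HIGH - STATE_LOW)
    indices = (ratios * (np.array(BUCKETS) - 1)).round().astype(int)
    # Clip so observations outside the assumed bounds still land in a valid bucket.
    return tuple(np.clip(indices, 0, np.array(BUCKETS) - 1))
```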

Both models solve the environment fairly consistently in about 600 episodes (with some randomness, of course).

Contents

- Behaviour
- Results
- Inspired by / References

Behaviour

Off-policy TD control algorithm (Q-learning)

Procedural algorithm
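
For reference, the off-policy TD control (Q-learning) update named above can be sketched as follows; the hyperparameter values and the epsilon-greedy behaviour policy shown here are illustrative assumptions, not necessarily the exact choices made in this repository.

```python
import random
from collections import defaultdict

ALPHA = 0.1    # learning rate (illustrative)
GAMMA = 0.99   # discount factor (illustrative)
EPSILON = 0.1  # exploration rate (illustrative)

Q = defaultdict(float)  # Q[(state, action)] -> estimated return

def choose_action(state, n_actions):
    """Epsilon-greedy behaviour policy."""
    if random.random() < EPSILON:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state, n_actions):
    """Off-policy TD(0) update: the target bootstraps from the greedy action."""
    best_next = max(Q[(next_state, a)] for a in range(n_actions))
    td_target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])
```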

To let the agent explore the environment and build up an optimal policy without converging on a suboptimal one too quickly, both the exploration rate and the learning rate decay over time. After a large amount of exploration, the agent shifts to exploiting the more optimal path to balance the pole.

The model has no separate exploration and exploitation phases; instead, it gradually shifts toward exploitation as time goes on. In the rare cases where it fails to explore the environment fully before it begins to exploit, it can take significantly longer to solve the problem. Even in the later stages the model continues to explore, just at a significantly reduced rate.
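
One common way to implement such a schedule is a logarithmic decay of both rates with the episode number, as sketched below; the functional form and the constants are assumptions and may not match the ones used here.

```python
import math

MIN_EPSILON = 0.01  # floor so the agent never stops exploring entirely
MIN_ALPHA = 0.05    # floor on the learning rate
DECAY = 25.0        # larger -> slower decay (illustrative constant)

def exploration_rate(episode):
    """Epsilon decays logarithmically with the episode number."""
    return max(MIN_EPSILON, min(1.0, 1.0 - math.log10((episode + 1) / DECAY)))

def learning_rate(episode):
    """Alpha decays on the same schedule as epsilon."""
    return max(MIN_ALPHA, min(1.0, 1.0 - math.log10((episode + 1) / DECAY)))
```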

Results

The learned model balancing the pole.

(Figures: Cart; Runs)

Inspired by / References

@book{ sutton_barto_2018, 
    title={Reinforcement Learning: An Introduction}, 
    publisher={MIT Press Ltd}, 
    author={Sutton, Richard S. and Barto, Andrew G.}, 
    year={2018}
}

@misc{ mnih_kavuk_silver_2015,
    title={Human-level control through deep reinforcement learning},
    url={https://www.nature.com/articles/nature14236},
    journal={Nature},
    publisher={Nature Publishing Group},
    author={Mnih, Volodymyr and Kavukcuoglu, Koray and Silver, David},
    year={2015}
}
