An implementation of reinforcement learning, using Reinforcement.js, in my Maze Generation Algorithm.
The main API for the environment is taken from the Maze Generation Algorithm. The world (implemented for reinforcement learning) additionally provides the following:

- `reset`: resets the game state.
- `reward`: returns the reward for the current state, action, and next state.
- `sampleNextState`: calculates the next state and reward from an action suggested by the agent.
- `allowedActions`: returns the allowed actions at the current cell.

Apart from these, everything else in the world is a helper.
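
As a rough sketch of how these four methods might fit together, here is a tiny stand-in world on an open grid with no inner walls. The `makeWorld` name, the state encoding, the action numbering, the goal position, the placeholder reward values, and the shape of the object returned by `sampleNextState` are all assumptions made for illustration; this is not the project's actual code.

```javascript
// Hypothetical minimal world: an open size x size grid with no inner walls.
// States are cell indices 0..size*size-1 (row-major).
// Actions: 0 = left, 1 = up, 2 = right, 3 = down.
function makeWorld(size) {
  const goal = size * size - 1; // assumed goal: bottom-right cell
  const moves = [[-1, 0], [0, -1], [1, 0], [0, 1]]; // [dx, dy] per action

  const world = {
    state: 0,

    // Reset the game state and return the start state.
    reset() {
      world.state = 0;
      return world.state;
    },

    // Return the actions that keep the agent inside the grid at cell s.
    allowedActions(s) {
      const x = s % size;
      const y = Math.floor(s / size);
      return moves
        .map(([dx, dy], a) => ({ a, nx: x + dx, ny: y + dy }))
        .filter(({ nx, ny }) => nx >= 0 && nx < size && ny >= 0 && ny < size)
        .map(({ a }) => a);
    },

    // Return the reward for moving from s to ns via action a.
    // Placeholder values: a small step cost, and a bonus at the goal.
    reward(s, a, ns) {
      return ns === goal ? size : -0.01;
    },

    // Compute the next state and reward for an action suggested by the agent.
    // The caller is expected to pass only actions from allowedActions(s).
    sampleNextState(s, a) {
      const [dx, dy] = moves[a];
      const ns = s + dx + dy * size;
      const r = world.reward(s, a, ns);
      world.state = ns;
      return { ns, r, resetEpisode: ns === goal };
    },
  };
  return world;
}
```

A training loop would typically call `reset` once per episode, ask the agent for one of the `allowedActions`, and feed the result of `sampleNextState` back to the agent.
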
The agent is rewarded under the following conditions:

- `-0.01` on every move, so that the agent doesn't get stuck in the same place.
- When the agent solves the puzzle, it receives a positive reward equal to the size of the puzzle.
- `-0.1` on the start, top-right, and bottom-left tiles.

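
The reward rules above could be written roughly as follows. This is only a sketch under stated assumptions: the `rewardFor` name, the `(nx, ny)` coordinates of the tile being entered, a start at the top-left tile, a goal at the bottom-right tile, "size" meaning the side length of the maze, and the tile penalty stacking with the per-move cost are all illustrative choices that the text does not confirm.

```javascript
// Hypothetical reward helper for a size x size maze.
// Assumptions: start is top-left (0, 0), goal is bottom-right,
// and "size of the puzzle" means the side length.
function rewardFor(nx, ny, size) {
  const last = size - 1;

  // Solving the puzzle: positive reward equal to the puzzle size.
  if (nx === last && ny === last) return size;

  let r = -0.01; // small cost on every move
  const penalized =
    (nx === 0 && ny === 0) ||    // start tile
    (nx === last && ny === 0) || // top-right tile
    (nx === 0 && ny === last);   // bottom-left tile
  if (penalized) r += -0.1; // assumed to stack with the move cost
  return r;
}
```

If the tile penalty is meant to replace the move cost rather than add to it, the last branch would assign `-0.1` instead of adding it.
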
| Spec | Value |
|---|---|
| Discount Factor (gamma) | 0.9 |
| Epsilon-greedy Policy (epsilon) | 0.2 |
| Learning Rate (alpha) | 0.3 |
| Eligibility Trace Decay (lambda) | 0 |
| Replacing Traces | true |
| Number of Planning Steps per Iteration | 50 |
| Smooth Policy Update | true |
| Learning Rate for Smooth Policy (beta) | 0.1 |
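
For reference, the table can be expressed as a plain options object. The property names below follow the spec fields of the TD agent in Karpathy's REINFORCEjs (`gamma`, `epsilon`, `alpha`, `lambda`, `replacing_traces`, `planN`, `smooth_policy_update`, `beta`). That this project uses that library and passes the values in exactly this form is an assumption, so treat it as a sketch.

```javascript
// Agent spec mirroring the table above.
// Field names are assumed to match the REINFORCEjs TD agent spec.
const spec = {
  gamma: 0.9,                 // discount factor
  epsilon: 0.2,               // exploration rate for the epsilon-greedy policy
  alpha: 0.3,                 // value function learning rate
  lambda: 0,                  // eligibility trace decay (0 disables traces)
  replacing_traces: true,     // replacing rather than accumulating traces
  planN: 50,                  // planning steps per iteration
  smooth_policy_update: true, // update the policy smoothly toward the greedy one
  beta: 0.1,                  // learning rate for the smooth policy update
};
```

In REINFORCEjs, an object like this would be handed to the agent constructor together with the environment, for example `new RL.TDAgent(env, spec)`.
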