| field | value |
|---|---|
| title | Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning |
| section | Oral |
| openreview | xQx1O7WXSA |
| abstract | Sampling-based motion planning algorithms excel at searching global solution paths in geometrically complex settings. However, classical approaches, such as RRT, are difficult to scale beyond low-dimensional search spaces and rely on privileged knowledge, e.g., about collision detection and underlying state distances. In this work, we take a step towards the integration of sampling-based planning into the reinforcement learning framework to solve sparse-reward control tasks from high-dimensional inputs. Our method, called VELAP, determines sequences of waypoints through sampling-based exploration in a learned state embedding. Unlike other sampling-based techniques, we iteratively expand a tree-based memory of visited latent areas, which is leveraged to explore a larger portion of the latent space for a given number of search iterations. We demonstrate state-of-the-art results in learning control from offline data in the context of vision-based manipulation under sparse reward feedback. Our method extends the set of available planning tools in model-based reinforcement learning by adding a latent planner that searches globally for feasible paths instead of being bound to a fixed prediction horizon. |
| layout | inproceedings |
| series | Proceedings of Machine Learning Research |
| publisher | PMLR |
| issn | 2640-3498 |
| id | gieselmann23a |
| month | 0 |
| tex_title | Expansive Latent Planning for Sparse Reward Offline Reinforcement Learning |
| firstpage | 1 |
| lastpage | 22 |
| page | 1-22 |
| order | 1 |
| cycles | false |
| bibtex_author | Gieselmann, Robert and Pokorny, Florian T. |
| author | |
| date | 2023-12-02 |
| address | |
| container-title | Proceedings of The 7th Conference on Robot Learning |
| volume | 229 |
| genre | inproceedings |
| issued | |
| extras | |
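To make the idea in the abstract concrete, below is a minimal, hypothetical sketch of expansive tree search in a learned latent space. It assumes a pretrained latent dynamics step and a learned goal score are available as callables; the names `expansive_latent_plan`, `latent_step`, and `goal_score`, and all sampling parameters, are illustrative assumptions and not the paper's code or API. The density-weighted node selection stands in for the tree-based memory of visited latent areas described in the abstract.

```python
"""Illustrative sketch (not the authors' implementation) of expansive tree
search in a learned latent space, in the spirit of the abstract above."""
import numpy as np


class Node:
    def __init__(self, z, parent=None):
        self.z = z            # latent state (candidate waypoint)
        self.parent = parent  # parent node in the search tree


def expansive_latent_plan(z_start, latent_step, goal_score, n_iters=200,
                          n_candidates=8, radius=0.5, rng=None):
    """Grow a tree of latent waypoints, biasing expansion towards sparsely
    visited latent regions, and return the path to the best-scoring node."""
    rng = rng or np.random.default_rng(0)
    tree = [Node(np.asarray(z_start, dtype=float))]
    for _ in range(n_iters):
        # Select a parent inversely proportional to its local node density,
        # so the tree keeps pushing into unexplored parts of the latent space.
        zs = np.stack([n.z for n in tree])
        density = np.array([np.sum(np.linalg.norm(zs - n.z, axis=1) < radius)
                            for n in tree])
        weights = 1.0 / density
        parent = tree[rng.choice(len(tree), p=weights / weights.sum())]
        # Propose successors with the (assumed) latent dynamics and keep the
        # one that lands farthest from the already visited latent areas.
        candidates = [latent_step(parent.z, rng) for _ in range(n_candidates)]
        dists = [np.min(np.linalg.norm(zs - c, axis=1)) for c in candidates]
        tree.append(Node(candidates[int(np.argmax(dists))], parent=parent))
    # Backtrack from the node judged best by the learned goal score.
    best = max(tree, key=lambda n: goal_score(n.z))
    path = []
    while best is not None:
        path.append(best.z)
        best = best.parent
    return path[::-1]  # waypoints ordered from start towards the goal


if __name__ == "__main__":
    # Toy stand-ins: a random-walk "dynamics model" and a goal to the right.
    goal = np.array([3.0, 0.0])
    step = lambda z, rng: z + rng.normal(scale=0.3, size=z.shape)
    score = lambda z: -np.linalg.norm(z - goal)
    waypoints = expansive_latent_plan(np.zeros(2), step, score)
    print(len(waypoints), "waypoints; last:", np.round(waypoints[-1], 2))
```

The key property this sketch tries to capture is that the planner searches globally for a feasible waypoint sequence rather than rolling out a fixed prediction horizon; how the latent encoder, dynamics, and value estimates are actually trained and combined is described in the paper itself.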