| Field | Value |
|---|---|
| title | Autonomous Robotic Reinforcement Learning with Asynchronous Human Feedback |
| section | Poster |
| openreview | z3D__-nc9y |
| abstract | Ideally, we would place a robot in a real-world environment and leave it there, improving on its own by autonomously gathering more experience. However, algorithms for autonomous robotic learning have been challenging to realize in the real world. While this has often been attributed to the challenge of sample complexity, even sample-efficient techniques are hampered by two major challenges: the difficulty of providing well-“shaped” rewards, and the difficulty of continual reset-free training. In this work, we describe a system for real-world reinforcement learning that enables agents to improve continually by training directly in the real world, without requiring painstaking effort to hand-design reward functions or reset mechanisms. Our system uses occasional non-expert human-in-the-loop feedback from remote users to learn informative distance functions that guide exploration, while relying on a simple self-supervised learning algorithm for goal-directed policy learning. We show that in the absence of resets, it is particularly important to account for the current “reachability” of the exploration policy when deciding which regions of the space to explore. Based on this insight, we instantiate a practical learning system, GEAR, which enables robots to simply be placed in real-world environments and left to train autonomously without interruption. The system streams robot experience to a web interface, requiring only occasional asynchronous input from remote, crowdsourced, non-expert humans in the form of binary comparative feedback. We evaluate this system on a suite of robotic tasks and demonstrate its effectiveness at learning behaviors both in simulation and in the real world. Project website: https://guided-exploration-autonomous-rl.github.io/GEAR/ |
| layout | inproceedings |
| series | Proceedings of Machine Learning Research |
| publisher | PMLR |
| issn | 2640-3498 |
| id | balsells23a |
| month | 0 |
| tex_title | Autonomous Robotic Reinforcement Learning with Asynchronous Human Feedback |
| firstpage | 774 |
| lastpage | 799 |
| page | 774-799 |
| order | 774 |
| cycles | false |
| bibtex_author | Balsells, Max and Villasevil, Marcel Torne and Wang, Zihan and Desai, Samedh and Agrawal, Pulkit and Gupta, Abhishek |
| author | |
| date | 2023-12-02 |
| address | |
| container-title | Proceedings of The 7th Conference on Robot Learning |
| volume | 229 |
| genre | inproceedings |
| issued | |
| extras | |