title

abstract

layout

series

publisher

issn

id

month

tex_title

firstpage

lastpage

page

order

cycles

bibtex_editor

editor

bibtex_author

author

date

note

address

container-title

volume

genre

issued

pdf

extras

Dyna-style planning with linear function approximation and prioritized sweeping

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available after each interaction with the world. This paper develops an explicitly model-based approach extending the Dyna architecture to linear function approximation. Dyna-style planning proceeds by generating imaginary experience from the world model and then applying model-free reinforcement learning algorithms to the imagined state transitions. Our main results establish that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions. In the policy evaluation setting, we prove that the limit point is the least-squares (LSTD) solution. An implication of our results is that prioritized sweeping can be soundly extended to the linear approximation case, backing up to preceding features rather than to preceding states. We introduce two versions of prioritized sweeping with linear Dyna and briefly illustrate their performance empirically on the Mountain Car and Boyan Chain problems.

inproceedings

Proceedings of Machine Learning Research

PMLR

2640-3498

sutton08a

0

Dyna-style planning with linear function approximation and prioritized sweeping

528

536

528-536

528

false

McAllester, David A. and Myllym{\"a}ki, Petri

given	family
David A.	McAllester

given	family
Petri	Myllymäki

Sutton, Richard S. and Szepesv\'{a}ri, Csaba and Geramifard, Alborz and Bowling, Michael

given	family
Richard S.	Sutton

given	family
Csaba	Szepesvári

given	family
Alborz	Geramifard

given	family
Michael	Bowling

2008-07-09

Reissued by PMLR on 30 October 2024.

Proceedings of the 24th Conference on Uncertainty in Artificial Intelligence

R6

inproceedings

date-parts

2008

7

9

http://proceedings.mlr.press/r6/sutton08a/sutton08a.pdf
