Hey guys,

Thanks again for this amazing library that makes training RL agents extremely easy. I have a quick question about the `act()` function. This is the function responsible for collecting the agent's experiences in the environment. In this phase, the actor model is used, which is different from the learner model. In PyTorch, as you might know, there are two modes: 'train' and 'eval'. I was expecting `act()` to call `model.eval()` before starting to collect new experiences, but that is not happening here: https://github.com/facebookresearch/torchbeast/blob/master/torchbeast/monobeast.py#L128

I have seen people argue that in an RL setup it is important to disable dropout to reduce the variance of the policy. This would be a side effect of calling `eval()`. I can see that the default agent doesn't have any dropout, so maybe this wasn't required in your case. What would you recommend?
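For concreteness, here is a minimal illustration of what the train/eval switch changes for a model that *does* have dropout. The toy `policy` below is made up for this example; it is not the torchbeast agent:

```python
import torch
import torch.nn as nn

# Toy policy with dropout -- a made-up model for illustration,
# not the torchbeast agent (which has no dropout layers).
policy = nn.Sequential(nn.Linear(4, 16), nn.Dropout(p=0.5), nn.Linear(16, 2))
x = torch.randn(1, 4)

policy.train()                      # dropout active: forward passes are stochastic
y1, y2 = policy(x), policy(x)
print(torch.equal(y1, y2))          # almost surely False

policy.eval()                       # dropout disabled: forward passes are deterministic
with torch.no_grad():
    z1, z2 = policy(x), policy(x)
print(torch.equal(z1, z2))          # True
```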
aleSuglia changed the title from "act() function doesn't use model in eval() mode" to "act() function doesn't use model in eval mode" on Oct 31, 2020.
This AtariNet has training-only behavior during act periods: `if self.training: action = torch.multinomial(F.softmax(policy_logits, dim=1), num_samples=1)`. So if you call `model.eval()` during training, this line is skipped and the model always chooses the greedy action, which makes exploration during training fail completely.

As you say, `eval()` changes the behavior of dropout and normalization layers, but this AtariNet architecture is simple and doesn't have such layers, so with respect to those layers `.eval()` behaves the same as `.train()`.
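For reference, a minimal sketch of that branch, paraphrased from the AtariNet forward pass in monobeast.py. `select_action` is a hypothetical standalone helper; in the real code this logic lives inside the model's `forward()` and reads `self.training`:

```python
import torch
import torch.nn.functional as F

def select_action(policy_logits: torch.Tensor, training: bool) -> torch.Tensor:
    # Sketch of the action-selection branch discussed above.
    if training:
        # Training mode: sample an action from the softmax policy
        # distribution, i.e. the agent explores.
        return torch.multinomial(F.softmax(policy_logits, dim=1), num_samples=1)
    # Eval mode: always take the highest-logit action, i.e. purely greedy.
    # Calling model.eval() on the actor flips self.training to False,
    # so the agent would stop exploring entirely.
    return torch.argmax(policy_logits, dim=1, keepdim=True)
```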