Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds observation term history support to Observation Manager #1439

Merged
merged 11 commits into from
Dec 16, 2024

Conversation

jtigue-bdai
Copy link
Collaborator

@jtigue-bdai jtigue-bdai commented Nov 19, 2024

Description

This PR adds observation history by adding configuration parameters to ObservationTerms and having the ObservationManager handling the collection and storage of the histories via CircularBuffers.

Fixes #1208

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

* add history cfg option to observation manager

* Revert "add history cfg option to observation manager"

This reverts commit 4b00994b8c9ce6e58779150dc42a707b0bc6701f.

* add flatten history option to observation manager

* run formatter

* fix docstring
@jtigue-bdai jtigue-bdai self-assigned this Nov 19, 2024
@jtigue-bdai jtigue-bdai added enhancement New feature or request dev team Issue or pull request created by the dev team labels Nov 19, 2024
@KyleM73
Copy link
Contributor

KyleM73 commented Nov 19, 2024

From the discussion in #1208, this PR introduces observation history tracking per-term. It was discussed that per-group tracking should also be implemented. In theory, the same effect can be had by giving all obs terms the same history length with the config provided in this PR. However, I want to reiterate what I think are the two best arguments for implementing per-group obs histories as well, and would be happy to hear feedback:

  1. When writing deployment code, it is often more straightforward and natural to save the entire observation to a 2D buffer of size (H,D) for H history and D observation dimension. In the current implementation, separate buffers would need to be kept for each observation term as defined by Isaac Lab. However, it may not be trivial to do so, whereas to evaluate RL policies on hardware it is guaranteed that the observations will be stacked into the full length-D array at some point, making saving them as a per-group history easier to replicate on hardware.
  2. Providing an implementation at the group level is a cleaner interface for developers than having to change the history length for each obs term. Especially when subclassing observation group configs from parent classes, it would be much nicer to define the history in the observation group post init (the interface currently used for enable_corruption and concatenate_terms, etc.) as opposed to having to change it for every term. Having used only my own version of the code in this PR for the last ~6 months, I have inadvertently missed changing the history length for one or two obs terms many times, requiring restarting the run. This bug can be avoided by optionally choosing to only change the history length in a single location, namely in the obs group post init.

The PR as-is resolves per-term observation history tracking. If it is decided that Isaac Lab should only support per-term observation histories then I am happy to close my comment/proposal with this PR. Otherwise, I'm happy to help implement per-group history tracking as well.

Thanks for putting this together James!

@jtigue-bdai
Copy link
Collaborator Author

I agree @KyleM73,the current implementation only covers term level that we talked about (in a slightly different implementation). I think having the option for group level is still worthwhile. I also agree we should be able to apply group level history via an overall group history_length. I was thinking, if set, the group level history length would override the term history length within that group.

@jtigue-bdai jtigue-bdai marked this pull request as ready for review November 22, 2024 16:52
@@ -109,7 +121,7 @@ def append(self, data: torch.Tensor):
# at the fist call, initialize the buffer
if self._buffer is None:
self._pointer = -1
self._buffer = torch.empty((self.max_length, *data.shape), dtype=data.dtype, device=self._device)
self._buffer = torch.zeros((self.max_length, *data.shape), dtype=data.dtype, device=self._device)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jtigue-bdai Isn't it better to initialize the buffer to the latest data on the first append or right after a reset?

Otherwise, we assume that zero is a valid value for the observation, which may not always be true.
For example, if we are gathering a history of, lets say, the gravitational force, which may be defined as a strictly negative value, having a buffer filled with zeros as previous observations may not be within the expected distribution.

I would suggest that, during reset or at first init, all the indices in the history is initialized to the most recent data, possibly the one being passed into append()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats a good point. I think you are right about filling on the first append.

@@ -14,6 +14,7 @@
from typing import TYPE_CHECKING

from omni.isaac.lab.utils import modifiers
from omni.isaac.lab.utils.buffers import CircularBuffer

from .manager_base import ManagerBase, ManagerTermBase
from .manager_term_cfg import ObservationGroupCfg, ObservationTermCfg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot find how to provide a suggestion to lines that are not part of the changelist, so here we go.

In the method def __str__(self) -> str:,

You may need to re-compute the observation dimensions depending on whether flattening is enabled or not.
I see two deficiencies which can cause confusions:

  1. Currently, the printed summary does not handle history length while computing the observation dimension.
  2. The self._group_obs_dim is wrong and does not correspond to the actual observation dimension if history is used. This is quite critical to be fixed IMO.
  3. The shape of the observation group printed in the summary is wrong.

Hope this helps!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch I will address this.

Comment on lines 361 to 367
# create history buffers
if term_cfg.history_length > 0:
group_entry_history_buffer[term_name] = CircularBuffer(
max_len=term_cfg.history_length, batch_size=self._env.num_envs, device=self._env.device
)
# call function the first time to fill up dimensions
obs_dims = tuple(term_cfg.func(self._env, **term_cfg.params).shape)
Copy link
Contributor

@aravindev aravindev Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# create history buffers
if term_cfg.history_length > 0:
group_entry_history_buffer[term_name] = CircularBuffer(
max_len=term_cfg.history_length, batch_size=self._env.num_envs, device=self._env.device
)
# call function the first time to fill up dimensions
obs_dims = tuple(term_cfg.func(self._env, **term_cfg.params).shape)
# call function the first time to fill up dimensions
obs_dims = tuple(term_cfg.func(self._env, **term_cfg.params).shape)
# create history buffers
if term_cfg.history_length > 0:
group_entry_history_buffer[term_name] = CircularBuffer(
max_len=term_cfg.history_length,
batch_size=self._env.num_envs,
device=self._env.device,
)
obs_dims = (obs_dims[0], term_cfg.history_length * obs_dims[1], *obs_dims[2:])
if term_cfg.flatten_history_dim:
obs_dims = (obs_dims[0], np.prod(obs_dims[1:]),)

This populates the correct _group_obs_term_dim into the dictionary. Later, this is used to compute the _group_obs_dim however, I assume that its computation does not need any changes.

@kousheekc
Copy link

kousheekc commented Dec 13, 2024

Hello, thanks for this PR, having a group level history length is exactly what I needed for my use case. I had implemented a hacky solution myself that did the job but did not notice this PR, so will be switching to this implementation now. However, I was wondering if it may be worth adding a specific task as a demo example to demonstrate this feature. Or would it make more sense to have a separate PR for that? I have some ideas in mind and I would be glad to support for this part if required.

@fan-ziqi
Copy link
Contributor

I also need to use historical observations. Currently, I've implemented historical observations myself in rsl_rl, which is quite cumbersome. Looking forward to the merge of this PR!

@kellyguo11 kellyguo11 changed the title Add observation term history support to Observation Manager Adds observation term history support to Observation Manager Dec 15, 2024
@kellyguo11 kellyguo11 merged commit f7b59b3 into main Dec 16, 2024
4 of 5 checks passed
@kellyguo11 kellyguo11 deleted the jat/feat/obs_term_history branch December 16, 2024 03:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dev team Issue or pull request created by the dev team enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Proposal] Observation Histories
7 participants