[jazzy] Improve the reliability of rosbag2 tests (backport #1796) #1806
Cherry-pick of 2d4d02f has failed:
To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
* Remove wait_until_shutdown.

  This has almost exactly the same functionality as wait_for_condition, except for two things:
  1. It is templated on the Timeout type.
  2. It calls rclcpp::shutdown after the loop completes.

  However, neither of those is necessary; all callers to it use a std::chrono::duration, and all of the test fixtures already call rclcpp::shutdown. Thus, just remove it and make all callers use wait_for_condition instead.

  Signed-off-by: Chris Lalancette <[email protected]>

* Shutdown the async spinner node without rclcpp::shutdown.

  That is, we really don't actually want to do a full rclcpp shutdown here; we only want to stop spinning. Accomplish that with an executor, and timing out every 100 milliseconds to check if we are done yet.

  Signed-off-by: Chris Lalancette <[email protected]>

* Small fixes to start_async_spin in rosbag2_tests.

  Make sure it only spins as long as we haven't shutdown, and that it wakes up every so often to check that fact.

  Signed-off-by: Chris Lalancette <[email protected]>

* Wait for topics to be discovered during recorder->record().

  The main reason for that is that these tests generally want to test certain expectations around how many messages were received. However, if discovery takes longer than we expect, then it could be the case that we "missed" messages at the beginning because discovery hadn't yet completed. Fix this by just waiting around for the recorder to get all the subscriptions it expects before moving on with the test.

  Signed-off-by: Chris Lalancette <[email protected]>

* Feedback from review.

  Signed-off-by: Chris Lalancette <[email protected]>

* Switch to using MockRecorder.

  Signed-off-by: Chris Lalancette <[email protected]>

* Fixes from review.

  Signed-off-by: Chris Lalancette <[email protected]>

* Feedback from review.

  Signed-off-by: Chris Lalancette <[email protected]>

* Apply suggestions from code review

  Co-authored-by: Michael Orlov <[email protected]>
  Signed-off-by: Chris Lalancette <[email protected]>

* Switch to using spin, rather than spin_some.

  That's because there is currently at least one bug associated with spin_some in rclcpp. However, it turns out that we don't even need to use it, as we can just as easily use spin() along with exec.cancel().

  Signed-off-by: Chris Lalancette <[email protected]>

* Make sure to stop_spinning when we tear down the test.

  Signed-off-by: Chris Lalancette <[email protected]>

* Use scopes to shutdown spinning.

  Signed-off-by: Chris Lalancette <[email protected]>

* Nested contexts just to explicitly cleanup the async spinners.

  Signed-off-by: Chris Lalancette <[email protected]>

* Update rosbag2_transport/test/rosbag2_transport/record_integration_fixture.hpp

  Co-authored-by: Michael Orlov <[email protected]>
  Signed-off-by: Chris Lalancette <[email protected]>

* Apply the same fix to rosbag2_tests.

  Signed-off-by: Chris Lalancette <[email protected]>

---------

Signed-off-by: Chris Lalancette <[email protected]>
Co-authored-by: Michael Orlov <[email protected]>
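The "stop spinning without a full shutdown" and "use scopes to shutdown spinning" commits can be illustrated with a standard-library analogue. This is only a sketch of the pattern, not the rosbag2 code: `ScopedSpinner` and its members are hypothetical names, the loop body stands in for executor work, and the 100 millisecond wake-up mirrors the interval mentioned in the commit message. The actual change uses an rclcpp executor with spin() and exec.cancel().

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an async spinner; not part of rclcpp or rosbag2.
// The background loop stops when this object leaves scope, without touching
// any process-wide state (the analogue of avoiding rclcpp::shutdown).
class ScopedSpinner
{
public:
  ScopedSpinner()
  : worker_([this] {run();}) {}

  ~ScopedSpinner()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_requested_ = true;
    }
    cv_.notify_all();  // wake the loop now instead of waiting out the interval
    assert(worker_.joinable());
    worker_.join();
  }

  ScopedSpinner(const ScopedSpinner &) = delete;
  ScopedSpinner & operator=(const ScopedSpinner &) = delete;

  // Number of loop iterations executed so far.
  int iterations() const {return iterations_.load();}

private:
  void run()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!stop_requested_) {
      ++iterations_;  // placeholder for "process pending work"
      // Sleep up to 100 ms, but return early if a stop was requested.
      cv_.wait_for(
        lock, std::chrono::milliseconds(100), [this] {return stop_requested_;});
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  bool stop_requested_ = false;
  std::atomic<int> iterations_{0};
  std::thread worker_;  // declared last so the other members exist before it starts
};
```

Wrapping the spinner in a nested block (`{ ScopedSpinner spinner; /* test body */ }`) makes the point at which spinning stops explicit, which is the idea behind the "nested contexts" commit.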
b90c88d to 60f1042
Pulls: #1806
The Red Hat build finished with a warning that neither CONNEXTDDS_DIR nor NDDSHOME was specified. However, this is unrelated to this PR, so I am merging.
I initially started this series trying to track down a rare flaky test failure in test_record__rmw_cyclonedds_cpp. That particular flake seems to happen because discovery sometimes takes longer than we expect, so the tests can "miss" the first publication. If that's the case, then the rest of the test may fail because it is expecting a certain number of messages. Along the way, we clean up the tests a bit:

* Remove the wait_until_shutdown method, which was almost exactly the same as wait_for_condition.
* Shut down the async spinner node without rclcpp::shutdown.
* Wait for topics to be discovered during recorder->record(). This ensures we can't get into the above situation.

With this series in place, the particular flake of test_record on rmw_cyclonedds_cpp is fixed (or, at least, I can no longer reproduce it). There are still other flakes that can happen under load, but I think fixes for those will have to come separately.

This is an automatic backport of pull request #1796 done by Mergify.
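For readers unfamiliar with the pattern, here is a minimal sketch of a poll-until-true helper of the kind the description refers to. The name `wait_for_condition_sketch`, its signature, and the default 10 millisecond poll interval are assumptions made for illustration; they are not the actual rosbag2 test helper.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: poll `condition` until it returns true or `timeout`
// elapses. Returns true if the condition was met, false on timeout.
template<typename Condition>
bool wait_for_condition_sketch(
  std::chrono::milliseconds timeout,
  Condition condition,
  std::chrono::milliseconds poll_interval = std::chrono::milliseconds(10))
{
  assert(poll_interval.count() > 0);
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (!condition()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // gave up: the condition never became true in time
    }
    std::this_thread::sleep_for(poll_interval);
  }
  return true;
}
```

In a test, the condition would be something like "the recorder has as many subscriptions as the test expects", checked before the first message is published, so a slow discovery phase delays the test instead of causing it to miss messages.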