I would like to have a function to convert a rosbag file into a parquet one. I want to load some rosbag datasets into pandas to use the existing data science ecosystem. I have tried to convert these datasets into csv files but I faced some issues decoding image data from some of the messages.
I think we could improve this by adding a conversion option to the rosbag2 CLI that maps the messages to the parquet format without losing the encoding information, which is what happened to me when I converted the rosbag messages to csv.
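To make the csv problem concrete, here is a minimal sketch using only the Python standard library. The message dictionary is a made-up stand-in for a decoded image message (the field names are an assumption, not taken from a real bag), and this is only one plausible way the information gets lost: csv has no binary type, so the raw bytes are written as their text representation and every field comes back as a string.

```python
import csv
import io

# Hypothetical stand-in for a decoded image message: raw pixel bytes plus
# the metadata needed to interpret them.
message = {
    "encoding": "rgb8",
    "height": 1,
    "width": 2,
    "data": bytes([255, 0, 0, 0, 255, 0]),
}

# Write the message as a single csv row into an in-memory buffer.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(message))
writer.writeheader()
writer.writerow(message)

# Read the row back: csv has no type information, so every value is text.
buffer.seek(0)
restored = next(csv.DictReader(buffer))

print(type(message["data"]).__name__, "->", type(restored["data"]).__name__)
print(type(message["height"]).__name__, "->", type(restored["height"]).__name__)
```

A columnar format with a binary column type, such as parquet, would not need this text round trip, which is the motivation for the request.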
I believe that this feature could unlock a lot of value and potential for data engineers/scientists because it would be more convenient to work with ros datasets.
Related Issues
Not directly.
I should note that there is a similar issue but I have seen some interest in improving data processing using the parquet format, on ros2/ros2_tracing: issue
Completion Criteria
No completion criteria at the moment.
Implementation Notes / Suggestions
I have no suggestion because I am a new user to ROS but I would be available to implement this feature.
Testing Notes / Suggestions
No suggestions at the moment.
My initial instinct for this would be to create a new StoragePlugin for the parquet format. You wouldn't need to write any core rosbag2 code, since the plugin could be independent. Then you could use ros2 bag convert to convert any existing bag to the new storage format. I don't think you'd run into any technical blockers; it should be totally doable as a standalone plugin library. See https://github.com/ros2/rosbag2/blob/rolling/docs/storage_plugin_development.md for a description of plugin development.
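As a rough sketch of what the convert step could look like once such a plugin exists: the helper names below are made up for illustration, and the storage id "parquet" is an assumption (no such plugin ships with rosbag2). The output options layout (output_bags, uri, storage_id) and the -i / -o options follow my reading of the rosbag2 README, so check them against your distribution.

```python
def render_output_options(output_uri, storage_id):
    """Return the YAML text for a single-output `ros2 bag convert` options file."""
    return (
        "output_bags:\n"
        f"- uri: {output_uri}\n"
        f"  storage_id: {storage_id}\n"
    )


def convert_command(input_bag, options_path):
    """Return the argument list for converting one input bag."""
    return ["ros2", "bag", "convert", "-i", str(input_bag), "-o", str(options_path)]


# "parquet" is a hypothetical storage id that a new plugin would register.
options_text = render_output_options("my_bag_parquet", "parquet")
command = convert_command("my_bag", "convert_options.yaml")

print(options_text)
print(" ".join(command))
```

Writing options_text to convert_options.yaml and running the printed command would then perform the conversion, provided the plugin is installed and discoverable.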
https://github.com/ros-tooling/rosbag2_sample_plugins provides some example plugins. They might be a little out of date, since they were written shortly after the Galactic release, but they're probably still mostly correct.
Note: ReadWriteInterface is the way to implement a writer, but if you don't want to support reading/playback, you could just throw an exception from the reader methods, or leave them empty, with a message such as "This is a write-only plugin".
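The write-only idea can be sketched in a few lines. This is a language-neutral illustration written in Python, not the rosbag2 API: real storage plugins are C++ classes implementing ReadWriteInterface, and the class and method names here are hypothetical.

```python
class WriteOnlyStorage:
    """Hypothetical write-only storage; NOT the rosbag2 C++ interface."""

    def __init__(self):
        # Rows collected in memory; a real plugin would write them to a file.
        self.rows = []

    def write(self, topic, timestamp_ns, payload):
        """Store one serialized message."""
        self.rows.append((topic, timestamp_ns, bytes(payload)))

    def has_next(self):
        # Reading is deliberately unsupported.
        raise NotImplementedError("This is a write-only plugin")

    def read_next(self):
        # Reading is deliberately unsupported.
        raise NotImplementedError("This is a write-only plugin")


storage = WriteOnlyStorage()
storage.write("/camera/image_raw", 1_000_000, b"\x00\x01")

try:
    storage.read_next()
except NotImplementedError as error:
    print(error)
```

Failing loudly on the read path makes it obvious to anyone who tries playback with this storage format that it is unsupported, rather than silently returning nothing.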
It looks like this work can be tracked in a separate repository with a new parquet storage plugin. In that case, I'll close this issue, since there is no work to be done in this repository. Feel free to comment on it, though.