Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task scheduling API #25

Closed
wants to merge 5 commits into from
Closed

Conversation

luca-della-vedova
Copy link
Member

New feature implementation

Implemented feature

Closes #16.

This PR adds an API to register / remove task schedules.
The implementation on the backend side will be using the rmf_scheduler node, which has (internal) API in the rmf_scheduler_msgs.

Implementation description

Following both the issue and the rmf_scheduler implementation, this API uses cron strings to describe the schedule for a straightforward integration with the scheduler.
An alternative approach could involve having a more readable API, at the expense of having to add a conversion layer to convert it to a cron string.

Services for adding and deleting task schedules have been provided. There should also probably be a "list" service that can be used by frontend / users to read what the current schedules are. This is especially important since schedules are recurring tasks that are persistent and could last indefinitely (i.e. years). They can only be deleted or updated by knowing their unique name so if the API doesn't let the user fetch a list of schedules it might be impossible to update a schedule if the user lost its original name.

Copy link
Contributor

@koonpeng koonpeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We need at least 3 copies of the same data, in rmf_scheduler, in task scheduler and in api server, so some kind of synchronization is a must.
    • I realize the endpoints here and in rmf_scheduler are insufficient to do proper synchronization. rmf_scheduler is missing delete events (which would be simple to add) and the endpoints here is missing query capabilities, this means that the task scheduler and api server must download the entire database periodically.
  • Since rmf_scheduler is an optional component, the task scheduler must have a way to determine if these endpoints are available.
  • This would also be a good time to think about authorization, if we do it at the api server, we can only do coarse authz based on roles (i.e. a user with create task role will be able to create/schedule any task regardless of the location, type, fleet etc), fine grained authorization based on the task details can only be done by the task scheduler so we need to have authz data in these schemas.

rmf_api_msgs/schemas/task_schedule.json Outdated Show resolved Hide resolved
rmf_api_msgs/schemas/task_schedule_remove_request.json Outdated Show resolved Hide resolved
@luca-della-vedova
Copy link
Member Author

Thanks for the feedback!

  • We need at least 3 copies of the same data, in rmf_scheduler, in task scheduler and in api server, so some kind of synchronization is a must.

    • I realize the endpoints here and in rmf_scheduler are insufficient to do proper synchronization. rmf_scheduler is missing delete events (which would be simple to add) and the endpoints here is missing query capabilities, this means that the task scheduler and api server must download the entire database periodically.

I guess my concern for downstream users is that they would have to subscribe to an "event" topic (i.e. for delete events as you mentioned) and make sure no messages are lost, otherwise the databases will not be in sync.
What would the use case for multiple copies of the same data? From what I understood there could be:

  • Frontend user for UI: When the page is loaded do a call to get the latest state of task schedules so it can be displayed in the UI. I do agree that it would be nice to have a "schedule event" subscription so we could have live updates of the webpage, without having to redownload the whole schedule database.
  • RMF traffic schedule: It should be agnostic to the database (at least for now) and only see the task when it gets published (again at least for now). In the future we could publish it "a bit in advance" to allow the traffic scheduler to plan for it better and if necessary keep its own database that could be synced in a way similar to above.
  • Any external user that wants to see the state of the task schedules, a "get task state API" should suffice for them.
  • RMF scheduler itself: It keeps the actual database.

I thought api server wouldn't need to keep the data and it can just call the matching ROS2 services when an API call is made

  • Since rmf_scheduler is an optional component, the task scheduler must have a way to determine if these endpoints are available.

I was thinking of using internally the ROS2 service system, where the API server can check if the rmf_scheduler service servers are available and only offer the endpoints if they are.

  • This would also be a good time to think about authorization, if we do it at the api server, we can only do coarse authz based on roles (i.e. a user with create task role will be able to create/schedule any task regardless of the location, type, fleet etc), fine grained authorization based on the task details can only be done by the task scheduler so we need to have authz data in these schemas.

I have no idea about this sadly, I agree that authorization is an important factor, my guess is that it would apply to everything, not only task schedules, so the authorization data would need to be everywhere, probably as a key in each of the schemas that the api server offers?

@koonpeng
Copy link
Contributor

Regarding multiple copies of data, it is a common side effect of the CQRS pattern (somewhat similar to the architecture we have). Because each service needs different sets of data, they usually end up maintaining their own copies. An example is authorization data, rmf scheduler has no support for authorization so it doesn't store the relevant data, so then some other services on top must store and link the authz data to the schedule data. The api server also need to support querying and paging, so unless the backend service also supports them, it must maintain its own database.

Only having one database can easily cause a cascading effect where the lowest layer must store and maintain data for all upper layers. It couples the lower layer api/data contract with the upper layers, making it impossible to do changes without also changing the whole stack, which also creates many versioning, compatibiliy, testing etc problems (because the lower layer never uses these data, but it must maintain api / database compatibility for them).

I was thinking of using internally the ROS2 service system, where the API server can check if the rmf_scheduler service servers are available and only offer the endpoints if they are.

A problem with this is the good old startup race condition issue which can be solved by polling, but we also need to make a distinction between "unavailable" and "not enabled" which conflicts with the polling algorithm. Another option is to always assume rmf scheduler is enabled and just have a time out, minor downside is that we can't tell from the logs the difference between "temporary down" and "never deployed".

I feel that scheduling tasks sounds like quite a "core" feature, it sounds like something that should be available "by default". But then it would also make rmf scheduler non-optional. It wouldn't be a problem if we only add task scheduling to a future major release without foxy support, so I'm not sure if we should bother with optional task scheduling.

@luca-della-vedova
Copy link
Member Author

I believe there is a different upstream implementation so will close this

@luca-della-vedova luca-della-vedova deleted the feature/task_schedule_api branch April 22, 2024 07:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Schema for recurring tasks
2 participants