Implement a health check API to track the status of Greenlight for better error handling. #477

Open
Nodirbek75 opened this issue Jul 9, 2024 · 1 comment

@Nodirbek75

We are currently using Breez-SDK in our app, which includes a health check API to track the status of the Breez server. However, it doesn't include a Greenlight health check. It would be great to add a health check API for tracking the Greenlight status so that we can implement better error handling in our app.

@cdecker: https://discord.com/channels/899980449231814676/900323634512551946/1193584731900612789

@cdecker
Collaborator

cdecker commented Jul 10, 2024

Thanks for reporting this issue and giving it a place to be discussed. This has been requested by several users, and we'd love to provide a status API. However, it is not that easy to boil the system status down to a single red or green bubble. Take for example the scalability dimension we chose: nodes. If a single node is having issues, does the system as a whole count as (partially) unavailable, or is it operating as expected? I'd argue it is still within normal operations, but to the user whose node is having issues (e.g., missing a signer, failing a payment, failing to reconnect to a peer), the system looks unavailable and unstable.

We will of course add statuses for the core services such as the tower and the scheduler, but those have 99.9+% uptime as we speak. Mostly it's the fleet of user nodes encountering issues individually, which we then fix up asap. Declaring the entire system unavailable because a small subset of users is having a sub-optimal time doesn't help others, as the experiences are rather subjective due to the separation between tenants on our system.

TL;DR: we need to define the semantics of available / unavailable for the nodes before we can provide a useful status indication for GL as a whole.
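
To make the distinction concrete, here is a rough sketch of one possible split between a platform-wide status and a per-node status. Nothing here is an existing Greenlight or Breez SDK API: `Status`, `HealthReport`, `platform_status`, `node_status`, and the rules that map a missing signer or zero connected peers to a status are all hypothetical, chosen only to illustrate the semantics that would need to be agreed on.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPERATIONAL = "operational"
    DEGRADED = "degraded"
    UNAVAILABLE = "unavailable"


@dataclass
class HealthReport:
    # Hypothetical shape: shared services are reported separately from
    # the caller's own node, since tenants are isolated from each other.
    core: dict[str, Status]  # e.g. {"scheduler": ..., "tower": ...}
    node: Status             # status of the caller's node only


def platform_status(core: dict[str, Status]) -> Status:
    """Platform-wide status, derived from the shared core services only.

    Individual node problems deliberately do not influence this value.
    """
    if all(s is Status.OPERATIONAL for s in core.values()):
        return Status.OPERATIONAL
    if all(s is Status.UNAVAILABLE for s in core.values()):
        return Status.UNAVAILABLE
    return Status.DEGRADED


def node_status(signer_attached: bool, peers_connected: int) -> Status:
    """Per-node status. The thresholds are assumptions, not agreed semantics."""
    if not signer_attached:
        return Status.UNAVAILABLE  # assumed: no signer, no signing, no payments
    if peers_connected == 0:
        return Status.DEGRADED     # assumed: running, but cannot route yet
    return Status.OPERATIONAL


report = HealthReport(
    core={"scheduler": Status.OPERATIONAL, "tower": Status.OPERATIONAL},
    node=node_status(signer_attached=False, peers_connected=2),
)
print(platform_status(report.core).value, report.node.value)
```

With a split like this, an app could show "the platform is up, but your node is missing a signer" instead of a single red or green bubble. The open question remains which node conditions should count as degraded versus unavailable.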


2 participants