Allow user to drop MV or sink during recovery #12999

Closed
fuyufjh opened this issue Oct 23, 2023 · 6 comments · Fixed by #12317

fuyufjh (Member) commented Oct 23, 2023

Is your feature request related to a problem? Please describe.

We have run into this a couple of times: a user creates a problematic MV, the cluster enters an endless crash loop, and the user can hardly drop the MV.

Our temporary solution is to have users do the following (a concrete command sketch follows the list):

  1. Set system parameter: ALTER SYSTEM SET pause_on_next_bootstrap to true
  2. Restart the meta service.
  3. Drop the relevant mviews.
  4. Restart again or use ctl to resume.
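
For reference, a sketch of what these steps look like in practice, assuming a SQL session against the frontend; problematic_mv is a placeholder name, and the exact ctl subcommand for resuming may differ by version:

```sql
-- 1. Make the next bootstrap start with the streaming graph paused.
ALTER SYSTEM SET pause_on_next_bootstrap TO true;

-- 2. Restart the meta service (done outside SQL, e.g. by restarting the meta node/pod).

-- 3. While the graph is paused, drop the problematic materialized view(s).
DROP MATERIALIZED VIEW problematic_mv;

-- 4. Restart again, or resume via ctl (outside SQL), e.g. something like:
--    risectl meta resume
```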

However, most users don't know about this, and they find the error message "cluster is under recovery" very frustrating.

Is it possible to allow users to drop an MV or sink during recovery?

Describe the solution you'd like

From my understanding, during cluster recovery we can simply delete the metadata of the specified MV/sink without caring about its actors, because the actors will naturally disappear the next time scheduling happens.

Describe alternatives you've considered

If the approach above is too hard to work out, we could also put the "temporary solution" into the official docs or even into the error message.

Additional context

No response

github-actions bot added this to the release-1.4 milestone Oct 23, 2023
BugenZhao (Member) commented Oct 26, 2023

We had a discussion before: #12203 (and PR #12317), and the tentative conclusion was that "safe mode" is sufficient. The only potential problem is that "safe mode" still requires the executor to be initialized correctly: if some executor triggers an OOM before awaiting the first barrier, there's still no way to drop it, as we rely on the barrier to drop a job.

we can simply delete the metadata of the specified MV/sink without caring about its actors, because the actors will naturally disappear the next time scheduling happens

This is true, but it would require implementing a new code path. I agree that this could be a beneficial improvement to consider in the future.

fuyufjh (Member, Author) commented Nov 14, 2023

...and the tentative conclusion was that "safe mode" is sufficient. The only potential problem is that "safe mode" still requires the executor to be initialized correctly: if some executor triggers an OOM before awaiting the first barrier, there's still no way to drop it, as we rely on the barrier to drop a job.

The "potential problem" actually sounds acceptable to me.

Instead, my main motivation is to make this available to users via a plain drop materialized view, instead of the secret ALTER SYSTEM SET pause_on_next_bootstrap TO true.

Is it possible to leverage the existing implementation to make drop materialized view work during recovery?

BugenZhao (Member) commented:

I guess we first need to define what "during recovery" means.

From our perspective, as long as we're not in the "insufficient parallel units" case, recovery should be fast enough, as the first barrier is always propagated almost instantly. So in most cases, DROP is actually called on an already-recovered pipeline. It's not that we don't allow users to drop an MV during recovery; rather, the Stop mutation is executed slowly on a problematic pipeline.

As a result, even if we implemented the "metadata cleaning" approach, it would rarely be hit, since the "during recovery" state does not last long. And if the recovery has already "succeeded", another recovery might be necessary for the cleaning to take effect.

From the users' perspective, things are much simpler: they just want the job dropped as fast as possible, no matter what we do behind the scenes. Therefore, I guess the ultimate goal of this issue is to make dropping more responsive, especially when there's a performance issue or a failure.

fuyufjh (Member, Author) commented Nov 15, 2023

From the users' perspective, things are much simpler: they just want the job dropped as fast as possible, no matter what we do behind the scenes. Therefore, I guess the ultimate goal of this issue is to make dropping more responsive, especially when there's a performance issue or a failure.

Totally agree with you. And I believe the "metadata cleaning" approach may not be a good idea.

So do you have any ideas about this question?

Is it possible to leverage the existing implementation to make drop materialized view work during recovery?

By "existent implementation" I mean:

  1. Set system parameter: ALTER SYSTEM SET pause_on_next_bootstrap to true
  2. Restart the meta service.
  3. Drop the relevant mviews.
  4. Restart again or use ctl to resume.

By "to make drop materialized view work during recovery" I mean: automate these things above, so that user can simply run a drop mv and we do these dark work for them.

fuyufjh (Member, Author) commented Nov 15, 2023

So here is my naive idea 🤣 I am thinking that, instead of returning the "cluster is under recovery" error to users, we could enhance this code branch as follows (a user-side sketch comes after the list):

  • Set a boolean flag pause_on_next_recovery = true
  • Once the cluster recovers again, after emitting the first barrier and before emitting any data events, perform the drop, i.e. emit the mutation barrier
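
For illustration, here is what this would look like from the user's side; my_bad_mv is a placeholder name, and the behaviour described in the comments is the proposal above, not what the system does today:

```sql
-- Proposed UX (sketch): during an endless crash/recovery loop, the user runs only
DROP MATERIALIZED VIEW my_bad_mv;
-- and, instead of failing with "cluster is under recovery", the meta service would
-- set pause_on_next_recovery = true, and on the next recovery, after the first
-- barrier and before any data events, emit the drop mutation for my_bad_mv and resume.
```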

BugenZhao (Member) commented:

See updated discussion here: #12203 (comment)
