Tracking issue for compaction offloading #1545

LeslieKid · 2024-07-17T15:51:17Z

Describe This Problem

We found in production that SST compaction cannot keep up with SST generation, which leads to poor query performance. However, we are unable to give more resources to compaction, because query and write are more important than compaction on the same node.

It is really hard to trade off resource allocation among query, write, and compaction in the LSM model. We want to compact the small generated SSTs as fast as possible, but we can't tolerate the impact on query/write. In the end, I think offloading compaction to separate nodes may be the key.

Proposal

The following is the architecture for compaction offloading.

To support compaction offloading, we need:

- Horaedb node supports submitting the compaction task to remote (feat: support horaedb submit compaction task to remote #1563)
  - Introduce runnable compaction node and expose the api for horaedb node to ask for remote compaction service (Execute Plane)
  - Impl remote mode compactor supporting compaction offload
- Horaemeta supports managing the compaction nodes (Control Plane)
  - Impl the ability to monitor the compaction servers (Monitor)
  - Impl scheduling algorithm for load balance and expose the api for horaedb node to get the proper remote compaction node (Scheduler)
- Integration tests for compaction (test: add integration test for compaction offload #1573)
  - See Integration test for compaction offload #1571 for more details
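
To make the Control Plane items more concrete, here is a rough sketch of how the Monitor and Scheduler could fit together. It is only an illustration: `NodeLoad`, `Scheduler`, `report`, and `pick` are hypothetical names, not horaemeta's API, and "fewest running tasks" is an assumed load-balancing policy that the proposal does not specify.

```rust
use std::collections::HashMap;

/// Hypothetical load report sent by a compaction node (Monitor input).
#[derive(Debug, Clone, PartialEq)]
pub struct NodeLoad {
    pub endpoint: String,
    pub running_tasks: usize,
    pub alive: bool,
}

/// Hypothetical scheduler state kept on the Control Plane.
#[derive(Default)]
pub struct Scheduler {
    nodes: HashMap<String, NodeLoad>,
}

impl Scheduler {
    /// Monitor side: record the latest report from a compaction node,
    /// replacing any earlier report for the same endpoint.
    pub fn report(&mut self, load: NodeLoad) {
        self.nodes.insert(load.endpoint.clone(), load);
    }

    /// Scheduler side: return the endpoint of the alive node with the
    /// fewest running tasks (ties broken by endpoint name), or `None`
    /// when no alive node is known.
    pub fn pick(&self) -> Option<&str> {
        self.nodes
            .values()
            .filter(|n| n.alive)
            .min_by(|a, b| {
                a.running_tasks
                    .cmp(&b.running_tasks)
                    .then_with(|| a.endpoint.cmp(&b.endpoint))
            })
            .map(|n| n.endpoint.as_str())
    }
}
```

A horaedb node would call something like `pick` before submitting a task, and the compaction nodes would call something like `report` periodically. How the real services exchange this information (RPC, heartbeat interval, and so on) is left open here.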

Additional Context

This issue replaces issue #1480. Please close issue #1480 as it is outdated.

incubator-horaedb-proto#133 is highly related to this issue.

Rachelint · 2024-07-17T16:24:29Z

looks great!

Rachelint · 2024-07-29T13:35:22Z

I think in the first stage, we can treat the compaction cluster the same as the horaedb cluster.
And we can just reuse the monitoring and scheduling abilities that horaemeta already has for the horaedb cluster.

We can start to design a specific scheduling strategy for the compaction cluster once it works.

LeslieKid · 2024-07-31T07:48:07Z

> I think in the first stage, we can treat the compaction cluster the same as the horaedb cluster. And we can just reuse the monitoring and scheduling abilities that horaemeta already has for the horaedb cluster.
>
> We can start to design a specific scheduling strategy for the compaction cluster once it works.

Thanks for your suggestion! I think it's a good idea to first implement a simple working version.

Rachelint · 2024-09-05T04:21:44Z

I think "Horaedb node supports submitting the compaction task to remote" may be similar to "Compaction node supporting remote compaction service".

I guess the third step can be the integration test. Testing is actually very important.

Rachelint · 2024-09-09T02:10:24Z

Maybe you will be interested in this when writing the test:
https://github.com/apache/horaedb/tree/main/integration_tests/dist_query

## Rationale

The subtask to support compaction offloading. See #1545

## Detailed Changes

**Compaction node support remote compaction service**

- Define `CompactionServiceImpl` to support compaction rpc service.
- Introduce `NodeType` to distinguish compaction node and horaedb node. Enable the deployment of compaction node.
- Impl `compaction_client` for horaedb node to access remote compaction node.

**Horaedb node support compaction offload**

- Introduce `compaction_mode` in analytic engine's `Config` to determine whether exec compaction offload or not.
- Define `CompactionNodePicker` trait, supporting get remote compaction node info.
- Impl `RemoteCompactionRunner`, supporting pick remote node and pass compaction task to the node.
- Add docs (e.g. `example-cluster-n.toml`) to explain how to deploy a cluster supporting compaction offload.

## Test Plan

---------

Co-authored-by: kamille <[email protected]>
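
For readers who want a feel for how the pieces listed above might connect, here is a minimal sketch. Only the names `CompactionNodePicker` and `RemoteCompactionRunner` come from the description; every signature, the `CompactionClient` trait, the `CompactionTask` and `Outcome` types, the stub implementations, and the `fallback_local` flag are assumptions made for illustration, not the code merged in the PR.

```rust
/// Hypothetical description of a compaction task (input SST ids only).
#[derive(Debug, Clone, PartialEq)]
pub struct CompactionTask {
    pub input_ssts: Vec<u64>,
}

/// Hypothetical result telling the caller where the task ended up running.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    RanRemotely(String),
    RanLocally,
}

/// Name taken from the PR description; this signature is an assumption.
pub trait CompactionNodePicker {
    /// Return the endpoint of a remote compaction node.
    fn pick(&self) -> Result<String, String>;
}

/// Assumed stand-in for the rpc client that talks to a compaction node.
pub trait CompactionClient {
    fn execute(&self, endpoint: &str, task: &CompactionTask) -> Result<(), String>;
}

/// Name taken from the PR description; fields and behavior are assumptions.
pub struct RemoteCompactionRunner<P, C> {
    picker: P,
    client: C,
    /// Assumed option: run locally when the remote path fails.
    fallback_local: bool,
}

impl<P: CompactionNodePicker, C: CompactionClient> RemoteCompactionRunner<P, C> {
    pub fn new(picker: P, client: C, fallback_local: bool) -> Self {
        Self { picker, client, fallback_local }
    }

    /// Pick a remote node and hand the task to it. On any failure, either
    /// fall back to local execution or surface the error to the caller.
    pub fn run(&self, task: &CompactionTask) -> Result<Outcome, String> {
        let remote = self
            .picker
            .pick()
            .and_then(|ep| self.client.execute(&ep, task).map(|_| ep));
        match remote {
            Ok(ep) => Ok(Outcome::RanRemotely(ep)),
            Err(_) if self.fallback_local => Ok(Outcome::RanLocally),
            Err(e) => Err(e),
        }
    }
}

/// Stub picker for experiments: always returns the configured endpoint.
pub struct FixedPicker(pub Option<String>);

impl CompactionNodePicker for FixedPicker {
    fn pick(&self) -> Result<String, String> {
        self.0.clone().ok_or_else(|| "no compaction node available".to_string())
    }
}

/// Stub client for experiments: succeeds or fails based on a flag.
pub struct StubClient {
    pub healthy: bool,
}

impl CompactionClient for StubClient {
    fn execute(&self, _endpoint: &str, _task: &CompactionTask) -> Result<(), String> {
        if self.healthy { Ok(()) } else { Err("remote compaction failed".to_string()) }
    }
}
```

Keeping the picker and the client behind traits means the runner can be exercised with stubs like the ones above, which may also be handy when writing the integration tests tracked in this issue.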

Rachelint assigned LeslieKid Jul 17, 2024

LeslieKid mentioned this issue Jul 19, 2024

feat: add compaction server supporting remote compaction service #1547

Closed

4 tasks

This was referenced Aug 21, 2024

feat: enable horaemeta to monitor compaction nodes. #1555

Closed

feat(horaemeta): impl compaction nodes management service #1559

Closed

LeslieKid mentioned this issue Sep 3, 2024

feat: support horaedb submit compaction task to remote #1563

Merged

LeslieKid mentioned this issue Sep 20, 2024

Integration test for compaction offload #1571

Closed
