Is your feature request related to a problem? Please describe.
Currently the compaction job queue configuration is based on a range of entry counts. An individual entry in the queue can vary in size based on the number of files in the tablet and the number of files in the compaction job, so it is hard to reason about entry counts. The goal of limiting the queue size is to limit memory usage.
Describe the solution you'd like
Have a single configuration that is a memory upper limit for compaction job queues. For example, the configuration would allow the queue to use up to 50M of memory. This would be much easier to understand and would work much better at limiting the memory used by the queue. The current configuration based on a range of entry counts (like the queue can range from 10 to 10000 entries) does not control memory usage in a predictable way.
How do you envision computing the memory for the compaction job? Caffeine has a nice Weigher API for computing the weight of an entry, which lets you control the maximum size based on memory instead of total count when the memory footprint of each entry varies, so maybe we can do something similar here.
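For reference, a minimal sketch of what weight-based capping looks like with Caffeine. The `JobKey`/`QueuedJob` types, the 50M budget, and the per-entry byte formula are illustrative assumptions, not the actual queue implementation:

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class WeightedQueueSketch {
  // Hypothetical stand-ins for the real queue entry types.
  record JobKey(String tabletId) {}
  record QueuedJob(int numFiles) {
    // Rough per-entry byte estimate: fixed overhead plus a per-file cost.
    int estimatedBytes() {
      return 256 + numFiles * 128;
    }
  }

  public static void main(String[] args) {
    // Cap the structure by estimated bytes (~50M) instead of entry count.
    Cache<JobKey, QueuedJob> jobs = Caffeine.newBuilder()
        .maximumWeight(50L * 1024 * 1024)
        .weigher((JobKey key, QueuedJob job) -> job.estimatedBytes())
        .build();

    jobs.put(new JobKey("tablet-1"), new QueuedJob(12));
  }
}
```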
Would write some custom code to compute a data size estimate of the object, similar to what would be done in the implementation of a Weigher.
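A rough sketch of what such a hand-rolled estimator could look like; the `Job` type and the per-field byte constants are illustrative assumptions only:

```java
import java.util.List;

public class JobSizeEstimator {
  // Hypothetical stand-in for a queued compaction job holding its file paths.
  record Job(List<String> filePaths) {}

  static int estimateBytes(Job job) {
    int bytes = 64; // assumed fixed per-job object overhead
    for (String path : job.filePaths()) {
      // assume ~2 bytes per char plus per-reference overhead for each file entry
      bytes += 48 + 2 * path.length();
    }
    return bytes;
  }

  public static void main(String[] args) {
    Job job = new Job(List.of(
        "hdfs://nn/accumulo/tables/1/t-0001/F0001.rf",
        "hdfs://nn/accumulo/tables/1/t-0001/F0002.rf"));
    System.out.println(estimateBytes(job) + " bytes (estimated)");
  }
}
```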
Should #5188 be implemented first before we do this? The MetaJob object stores both the CompactionJob and the TabletMetadata now, so if we drop the TabletMetadata then there will be some refactoring that impacts the code here. This issue also becomes a lot easier if we don't have to try to estimate the TabletMetadata size.
If this is done first, should not expend much effort on computing the TabletMetadata size. Could call TabletMetadata.toString().length() to estimate its size. This is not efficient, but it is quick to write the code for something that will go away.
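A minimal sketch of that quick throwaway estimate; the combined-entry shape and the 2-bytes-per-char factor for String memory cost are assumptions:

```java
public class QuickEntryEstimate {
  // Illustrative only: use the String form of both parts of the queue entry
  // as a crude proxy for their memory footprint.
  static int estimateEntryBytes(Object compactionJob, Object tabletMetadata) {
    return 32 + 2 * (compactionJob.toString().length()
        + tabletMetadata.toString().length());
  }

  public static void main(String[] args) {
    System.out.println(estimateEntryBytes("job[f1,f2]", "tablet[extent,files,...]"));
  }
}
```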