-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(store/v2): Support Custom WriteOpts PebbleDB #18195
Conversation
WalkthroughThe changes introduced provide more control over the synchronization behavior of batch writes in the PebbleDB storage system. A Changes
TipsChat with CodeRabbit Bot (
|
@alexanderbez your pull request is missing a changelog! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Files selected for processing (4)
- store/storage/pebbledb/batch.go (3 hunks)
- store/storage/pebbledb/db.go (4 hunks)
- store/storage/pebbledb/db_test.go (1 hunks)
- store/storage/storage_bench_test.go (1 hunks)
Additional comments (Suppressed): 13
store/storage/pebbledb/db_test.go (2)
20-20: The
New
function now requires async
parameter. Ensure that all calls to this function throughout the codebase have been updated to match the new signature. Also, consider the implications of settingsync
tofalse
in this test. If the test is meant to simulate real-world usage, it might be more appropriate to setsync
totrue
.32-32: The
New
function now requires async
parameter. Ensure that all calls to this function throughout the codebase have been updated to match the new signature. Also, consider the implications of settingsync
tofalse
in this test. If the test is meant to simulate real-world usage, it might be more appropriate to setsync
totrue
.store/storage/storage_bench_test.go (1)
- 26-26: The
pebbledb.New
function now requires a second parameter to specify whether write operations should be synchronous or asynchronous. Ensure that this change does not affect the behavior of the benchmark tests, and that the choice of synchronous or asynchronous writes is appropriate for the tests.- return pebbledb.New(dataDir) + return pebbledb.New(dataDir, false)store/storage/pebbledb/batch.go (3)
22-25: The function signature of
NewBatch
has been changed to include async
parameter. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.33-36: The
Batch
struct now includes async
field. This is a good addition as it allows the caller to control whether write operations should be synchronous or asynchronous.69-72: The
Write
method now uses thesync
field to set theSync
option when committing the batch. This is a good change as it allows the caller to control the synchronization behavior of write operations. However, please ensure that this change does not introduce any unexpected behavior in the rest of the codebase.store/storage/pebbledb/db.go (7)
24-24: The assertion that
Database
implementsstore.VersionedDatabase
is good practice to ensure that all required methods are implemented. This will cause a compile-time error if theDatabase
struct does not satisfy thestore.VersionedDatabase
interface.26-39: The addition of the
sync
field to theDatabase
struct is a significant change. This field will control whether write operations are synchronous or asynchronous, which can have a major impact on the performance and durability of the database. The comments provide a clear explanation of the purpose and implications of this field.41-44: The
New
function now accepts async
parameter, which is used to set thesync
field in theDatabase
struct. This allows the caller to control the synchronization behavior of the database at the time of creation.54-54: The
sync
field is correctly set in theDatabase
struct using thesync
parameter passed to theNew
function.58-62: The
NewWithDB
function now also accepts async
parameter, which is used to set thesync
field in theDatabase
struct. This allows the caller to control the synchronization behavior of the database when creating aDatabase
instance with an existing PebbleDB database.74-74: The
SetLatestVersion
method now uses thesync
field to set theSync
option when writing to the database. This allows the synchronization behavior of the write operation to be controlled by thesync
field in theDatabase
struct.144-144: The
ApplyChangeset
method now passes thesync
field to theNewBatch
function. This allows the synchronization behavior of the batch to be controlled by thesync
field in theDatabase
struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where would this get used? I dont think we should allow such a flag as it seems like it could cause other issues in shutdown
i do think there are other options we should look into configuring, The issue we had with current dbs was not looking deeper to get more performance. We should openan issue to test against our workloads and configure based on that. This change is based off small benchmarks which will show better results but its still not what we need for our use case
When the app instantiates the SS PebbleDB backend (assuming it's using PebbleDB).
I totally agree with you. I just wanted to give optionality. Sei, for example, would 100% use sync=false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- store/storage/pebbledb/db.go (4 hunks)
- store/storage/pebbledb/db_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
- store/storage/pebbledb/db_test.go
Additional comments: 5
store/storage/pebbledb/db.go (5)
24-38: The
sync
field has been added to theDatabase
struct. This field determines whether writes to the database are synchronous or asynchronous. The comments provide a clear explanation of the implications of setting this field to true or false. This is a good practice as it helps other developers understand the purpose and impact of this field.54-55: In the
New
function, thesync
field is set totrue
by default. This means that all writes to the database will be synchronous unless specified otherwise. This is a safe default setting as it ensures data durability.58-61: The
NewWithDB
function now accepts async
parameter. This allows the caller to specify whether writes to the database should be synchronous or asynchronous. This is a good addition as it provides more control over the database's write behavior.74-74: The
SetLatestVersion
function now uses thesync
field when setting the write options. This means that the write operation will be synchronous or asynchronous based on the value of thesync
field. This is a good change as it allows the write behavior to be controlled on a per-database basis.144-144: The
ApplyChangeset
function now passes thesync
field to theNewBatch
function. This means that the write behavior of the batch operation will be determined by the value of thesync
field. This is a good change as it allows the write behavior to be controlled on a per-batch basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- store/storage/pebbledb/db.go (4 hunks)
- store/storage/pebbledb/db_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
- store/storage/pebbledb/db_test.go
Additional comments: 5
store/storage/pebbledb/db.go (5)
24-38: The
sync
field has been added to theDatabase
struct to control the synchronization behavior of write operations. This is a significant change that can affect the durability and performance of write operations. Ensure that this change is well-documented and communicated to users of thepebbledb
package.54-55: The
New
function now initializes thesync
field totrue
by default, meaning that write operations will be synchronous by default. This could potentially slow down write operations, but it will ensure their durability.58-61: The
NewWithDB
function now accepts async
parameter to control the synchronization behavior of write operations. This is a significant change that can affect the durability and performance of write operations. Ensure that this change is well-documented and communicated to users of thepebbledb
package.74-74: The
SetLatestVersion
method now uses thesync
field to set theSync
option in apebble.WriteOptions
struct. This means that the synchronization behavior of theSetLatestVersion
method will depend on the value of thesync
field.144-144: The
ApplyChangeset
method now accepts async
parameter, which is passed to theNewBatch
function. This means that the synchronization behavior of theApplyChangeset
method will depend on the value of thesync
field.
Hey @alexanderbez @tac0turtle can we please backport this to the |
Hi @GAtom22 -- these changes are in respect to the store v2 work which is yet unreleased. I.e. this code does not exist on v0.47.x. What specifically do you want to see on v0.47.x? |
Ahh, nvm, my bad |
Description
Closes: #18178
Turns out, the default PebbleDB opts are sufficiently fine and show no major difference when tuned. The write options on batch writes show significant improvement in writes.
This PR allows apps to set the sync (write options).
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
!
to the type prefix if API or client breaking changeCHANGELOG.md
make lint
andmake test
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...
!
in the type prefix if API or client breaking changeSummary by CodeRabbit
Please note, this update does not directly affect the user interface but enhances the underlying data handling capabilities of the software.