Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(store/v2): Support Custom WriteOpts PebbleDB #18195

Merged
merged 3 commits into from
Oct 24, 2023

Conversation

alexanderbez
Copy link
Contributor

@alexanderbez alexanderbez commented Oct 20, 2023

Description

Closes: #18178

Turns out, the default PebbleDB opts are sufficiently fine and show no major difference when tuned. The write options on batch writes show significant improvement in writes.

This PR allows apps to set the sync (write options).


Author Checklist

All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.

I have...

  • included the correct type prefix in the PR title
  • added ! to the type prefix if API or client breaking change
  • targeted the correct branch (see PR Targeting)
  • provided a link to the relevant issue or specification
  • followed the guidelines for building modules
  • included the necessary unit and integration tests
  • added a changelog entry to CHANGELOG.md
  • included comments for documenting Go code
  • updated the relevant documentation or specification
  • reviewed "Files changed" and left comments if necessary
  • run make lint and make test
  • confirmed all CI checks have passed

Reviewers Checklist

All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.

I have...

  • confirmed the correct type prefix in the PR title
  • confirmed ! in the type prefix if API or client breaking change
  • confirmed all author checklist items have been addressed
  • reviewed state machine logic
  • reviewed API design and naming
  • reviewed documentation is accurate
  • reviewed tests and test coverage
  • manually tested (if applicable)

Summary by CodeRabbit

  • New Feature: Introduced an option to control data synchronization during batch writes and database operations. This allows users to choose between synchronous and asynchronous writes, providing a balance between data safety and performance.
  • Test: Updated test functions to accommodate the new synchronization feature. This ensures that our tests accurately reflect the new functionality and maintain the software's reliability.

Please note, this update does not directly affect the user interface but enhances the underlying data handling capabilities of the software.

@alexanderbez alexanderbez marked this pull request as ready for review October 20, 2023 21:55
@alexanderbez alexanderbez requested a review from a team as a code owner October 20, 2023 21:55
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 20, 2023

Walkthrough

The changes introduced provide more control over the synchronization behavior of batch writes in the PebbleDB storage system. A sync field has been added to the Batch and Database structs, and corresponding functions have been updated to accept a sync parameter. This allows for the choice between synchronous and asynchronous batch writes.

Changes

File Summary
store/storage/pebbledb/batch.go Introduced a sync field to the Batch struct and updated the NewBatch and Write functions to handle the sync parameter.
store/storage/pebbledb/db.go Removed defaultWriteOpts, added sync field to Database struct, and updated New, NewWithDB, SetLatestVersion, and ApplyChangeset functions to handle the sync parameter.
store/storage/pebbledb/db_test.go & store/storage/storage_bench_test.go Updated function calls to New to include the sync parameter.

🐇
"In the land of code, where logic is king,
A rabbit hopped forth, changes to bring.
With a twitch of the nose, and a flick of the ear,
Synchronization options now appear.
Now write with ease, sync or not,
The choice is yours, thanks to the rabbit's thought!" 🎉


Tips

Chat with CodeRabbit Bot (@coderabbitai)

  • If you reply to a review comment from CodeRabbit, the bot will automatically respond.
  • To engage with CodeRabbit bot directly around the specific lines of code in the PR, mention @coderabbitai in your review comment
  • Note: Review comments are made on code diffs or files, not on the PR overview.
  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai help to get help.

Note: For conversation with the bot, please use the review comments on code diffs or files.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.json

@github-actions
Copy link
Contributor

@alexanderbez your pull request is missing a changelog!

@github-prbot github-prbot requested review from a team, likhita-809 and testinginprod and removed request for a team October 20, 2023 21:55
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0

Commits Files that changed from the base of the PR and between 2fbc547 and 9caedec.
Files selected for processing (4)
  • store/storage/pebbledb/batch.go (3 hunks)
  • store/storage/pebbledb/db.go (4 hunks)
  • store/storage/pebbledb/db_test.go (1 hunks)
  • store/storage/storage_bench_test.go (1 hunks)
Additional comments (Suppressed): 13
store/storage/pebbledb/db_test.go (2)
  • 20-20: The New function now requires a sync parameter. Ensure that all calls to this function throughout the codebase have been updated to match the new signature. Also, consider the implications of setting sync to false in this test. If the test is meant to simulate real-world usage, it might be more appropriate to set sync to true.

  • 32-32: The New function now requires a sync parameter. Ensure that all calls to this function throughout the codebase have been updated to match the new signature. Also, consider the implications of setting sync to false in this test. If the test is meant to simulate real-world usage, it might be more appropriate to set sync to true.

store/storage/storage_bench_test.go (1)
  • 26-26: The pebbledb.New function now requires a second parameter to specify whether write operations should be synchronous or asynchronous. Ensure that this change does not affect the behavior of the benchmark tests, and that the choice of synchronous or asynchronous writes is appropriate for the tests.
- return pebbledb.New(dataDir)
+ return pebbledb.New(dataDir, false)
store/storage/pebbledb/batch.go (3)
  • 22-25: The function signature of NewBatch has been changed to include a sync parameter. Ensure that all calls to this function throughout the codebase have been updated to match the new signature.

  • 33-36: The Batch struct now includes a sync field. This is a good addition as it allows the caller to control whether write operations should be synchronous or asynchronous.

  • 69-72: The Write method now uses the sync field to set the Sync option when committing the batch. This is a good change as it allows the caller to control the synchronization behavior of write operations. However, please ensure that this change does not introduce any unexpected behavior in the rest of the codebase.

store/storage/pebbledb/db.go (7)
  • 24-24: The assertion that Database implements store.VersionedDatabase is good practice to ensure that all required methods are implemented. This will cause a compile-time error if the Database struct does not satisfy the store.VersionedDatabase interface.

  • 26-39: The addition of the sync field to the Database struct is a significant change. This field will control whether write operations are synchronous or asynchronous, which can have a major impact on the performance and durability of the database. The comments provide a clear explanation of the purpose and implications of this field.

  • 41-44: The New function now accepts a sync parameter, which is used to set the sync field in the Database struct. This allows the caller to control the synchronization behavior of the database at the time of creation.

  • 54-54: The sync field is correctly set in the Database struct using the sync parameter passed to the New function.

  • 58-62: The NewWithDB function now also accepts a sync parameter, which is used to set the sync field in the Database struct. This allows the caller to control the synchronization behavior of the database when creating a Database instance with an existing PebbleDB database.

  • 74-74: The SetLatestVersion method now uses the sync field to set the Sync option when writing to the database. This allows the synchronization behavior of the write operation to be controlled by the sync field in the Database struct.

  • 144-144: The ApplyChangeset method now passes the sync field to the NewBatch function. This allows the synchronization behavior of the batch to be controlled by the sync field in the Database struct.

Copy link
Member

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where would this get used? I dont think we should allow such a flag as it seems like it could cause other issues in shutdown

i do think there are other options we should look into configuring, The issue we had with current dbs was not looking deeper to get more performance. We should openan issue to test against our workloads and configure based on that. This change is based off small benchmarks which will show better results but its still not what we need for our use case

@tac0turtle tac0turtle disabled auto-merge October 21, 2023 09:15
@alexanderbez
Copy link
Contributor Author

Where would this get used?

When the app instantiates the SS PebbleDB backend (assuming it's using PebbleDB).

I dont think we should allow such a flag as it seems like it could cause other issues in shutdown

I totally agree with you. I just wanted to give optionality. Sei, for example, would 100% use sync=false.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 9caedec and 068ef4e.
Files selected for processing (2)
  • store/storage/pebbledb/db.go (4 hunks)
  • store/storage/pebbledb/db_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
  • store/storage/pebbledb/db_test.go
Additional comments: 5
store/storage/pebbledb/db.go (5)
  • 24-38: The sync field has been added to the Database struct. This field determines whether writes to the database are synchronous or asynchronous. The comments provide a clear explanation of the implications of setting this field to true or false. This is a good practice as it helps other developers understand the purpose and impact of this field.

  • 54-55: In the New function, the sync field is set to true by default. This means that all writes to the database will be synchronous unless specified otherwise. This is a safe default setting as it ensures data durability.

  • 58-61: The NewWithDB function now accepts a sync parameter. This allows the caller to specify whether writes to the database should be synchronous or asynchronous. This is a good addition as it provides more control over the database's write behavior.

  • 74-74: The SetLatestVersion function now uses the sync field when setting the write options. This means that the write operation will be synchronous or asynchronous based on the value of the sync field. This is a good change as it allows the write behavior to be controlled on a per-database basis.

  • 144-144: The ApplyChangeset function now passes the sync field to the NewBatch function. This means that the write behavior of the batch operation will be determined by the value of the sync field. This is a good change as it allows the write behavior to be controlled on a per-batch basis.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 0
Configuration used: CodeRabbit UI

Commits Files that changed from the base of the PR and between 068ef4e and 941d03a.
Files selected for processing (2)
  • store/storage/pebbledb/db.go (4 hunks)
  • store/storage/pebbledb/db_test.go (1 hunks)
Files skipped from review due to trivial changes (1)
  • store/storage/pebbledb/db_test.go
Additional comments: 5
store/storage/pebbledb/db.go (5)
  • 24-38: The sync field has been added to the Database struct to control the synchronization behavior of write operations. This is a significant change that can affect the durability and performance of write operations. Ensure that this change is well-documented and communicated to users of the pebbledb package.

  • 54-55: The New function now initializes the sync field to true by default, meaning that write operations will be synchronous by default. This could potentially slow down write operations, but it will ensure their durability.

  • 58-61: The NewWithDB function now accepts a sync parameter to control the synchronization behavior of write operations. This is a significant change that can affect the durability and performance of write operations. Ensure that this change is well-documented and communicated to users of the pebbledb package.

  • 74-74: The SetLatestVersion method now uses the sync field to set the Sync option in a pebble.WriteOptions struct. This means that the synchronization behavior of the SetLatestVersion method will depend on the value of the sync field.

  • 144-144: The ApplyChangeset method now accepts a sync parameter, which is passed to the NewBatch function. This means that the synchronization behavior of the ApplyChangeset method will depend on the value of the sync field.

@tac0turtle tac0turtle added this pull request to the merge queue Oct 24, 2023
Merged via the queue into main with commit 90723b7 Oct 24, 2023
58 checks passed
@tac0turtle tac0turtle deleted the bez/store-v2-pebbledb-sync-opt branch October 24, 2023 12:29
@GAtom22
Copy link
Contributor

GAtom22 commented Oct 30, 2023

Hey @alexanderbez @tac0turtle can we please backport this to the release/v0.47.x branch?

@alexanderbez
Copy link
Contributor Author

alexanderbez commented Oct 30, 2023

Hi @GAtom22 -- these changes are in respect to the store v2 work which is yet unreleased. I.e. this code does not exist on v0.47.x.

What specifically do you want to see on v0.47.x?

@GAtom22
Copy link
Contributor

GAtom22 commented Oct 30, 2023

Hi @GAtom22 -- these changes are in respect to the store v2 work which is yet unreleased. I.e. this code does not exist on v0.47.x.

What specifically do you want to see on v0.47.x?

Ahh, nvm, my bad

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Tune PebbleDB Options
4 participants