Block Size effect on RocksDB performance on State #9370

jbajic · 2023-07-28T15:30:17Z

Introduction

To measure the effect of block size on RocksDB performance for the State column, a tool for running RocksDB compaction was added to the database tooling. Before each state-perf benchmark run, we first compact the database with a specific block size and then run the state-perf database tool. One caveat to keep in mind: these numbers reflect storage in its best-case state. The database is fully compacted and the number of SST files is significantly reduced, because the triggered compaction runs on all LSM levels.

Measurements

Since compacting RocksDB takes a long time, four SSDs were created and each was compacted with a different block size. State was cloned four times, once per block size, onto four different GCP persistent disks attached to the same n2-highcpu-16 GCP node, and compaction was then run on each clone with its own block size.

Since four different disks were used, fio results are included for each disk:

Disk 1

Fio tool results:

read: IOPS=18.6k, BW=72.7MiB/s (76.2MB/s)(4361MiB/60001msec)
clat (usec): min=157, max=11829, avg=428.97, stdev=139.60
lat (usec): min=157, max=11829, avg=429.06, stdev=139.60

4 KiB block size

overall | avg observed_latency: 516.923µs, block_read_time: 490.688µs, samples with merge: 0 (0.00%)
block_read_count: 1, samples: 99993 (99.99%): | avg observed_latency: 516.643µs, block_read_time: 490.479µs, samples with merge: 0 (0.00%)
block_read_count: 6, samples: 7 (0.01%): | avg observed_latency: 4.502067ms, block_read_time: 3.465625ms, samples with merge: 0 (0.00%)
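
As a sanity check on how to read this output, the "overall" line should be the sample-weighted average of the per-bucket lines. A minimal Python sketch using the Disk 1 numbers above; the function name is illustrative and not part of the state-perf tool:

```python
def weighted_average(buckets):
    """Return the sample-weighted average of (sample_count, avg_latency_us) pairs."""
    total_samples = sum(count for count, _ in buckets)
    weighted_sum = sum(count * avg for count, avg in buckets)
    return weighted_sum / total_samples

# Disk 1, 4 KiB block size: (samples, avg observed_latency in microseconds)
disk1_buckets = [
    (99993, 516.643),  # block_read_count: 1
    (7, 4502.067),     # block_read_count: 6
]

overall_us = weighted_average(disk1_buckets)
print(f"overall avg observed_latency: {overall_us:.3f} us")
```

The result agrees with the reported overall value of 516.923µs up to rounding of the inputs, so the few samples with block_read_count: 6 have almost no influence on the average.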

Disk 2

Fio tool results:

read: IOPS=18.0k, BW=74.1MiB/s (77.7MB/s)(4446MiB/60001msec)
clat (usec): min=163, max=7789, avg=420.75, stdev=135.77
lat (usec): min=163, max=7789, avg=420.84, stdev=135.77

8 KiB block size

overall | avg observed_latency: 540.224µs, block_read_time: 507.651µs, samples with merge: 0 (0.00%)
block_read_count: 1, samples: 99995 (100.00%): | avg observed_latency: 540.09µs, block_read_time: 507.554µs, samples with merge: 0 (0.00%)
block_read_count: 6, samples: 5 (0.01%): | avg observed_latency: 3.236723ms, block_read_time: 2.450588ms, samples with merge: 0 (0.00%)

Disk 3

Fio tool results:

read: IOPS=18.6k, BW=72.6MiB/s (76.1MB/s)(4354MiB/60001msec)
clat (usec): min=157, max=7907, avg=429.67, stdev=146.28
lat (usec): min=158, max=7907, avg=429.75, stdev=146.28

16 KiB block size

overall | avg observed_latency: 612.676µs, block_read_time: 574.418µs, samples with merge: 0 (0.00%)
block_read_count: 1, samples: 99997 (100.00%): | avg observed_latency: 612.605µs, block_read_time: 574.37µs, samples with merge: 0 (0.00%)
block_read_count: 6, samples: 3 (0.00%): | avg observed_latency: 2.9739ms, block_read_time: 2.162229ms, samples with merge: 0 (0.00%)

Disk 4

Fio tool results:

read: IOPS=18.9k, BW=73.8MiB/s (77.4MB/s)(4430MiB/60001msec)
clat (usec): min=154, max=7154, avg=422.20, stdev=128.46
lat (usec): min=154, max=7154, avg=422.29, stdev=128.46

32 KiB block size

overall | avg observed_latency: 689.518µs, block_read_time: 640.242µs, samples with merge: 0 (0.00%)
block_read_count: 1, samples: 99997 (100.00%): | avg observed_latency: 689.442µs, block_read_time: 640.192µs, samples with merge: 0 (0.00%)
block_read_count: 6, samples: 3 (0.00%): | avg observed_latency: 3.22738ms, block_read_time: 2.28777ms, samples with merge: 0 (0.00%)

The fio results show that the four disks have very similar speed and standard deviation, so comparing block sizes across them is meaningful.
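
To put a number on "very similar", the spread of the average fio completion latency across the four disks can be computed directly from the results above. This is only a rough sketch over the four reported averages, not a statistical test:

```python
from statistics import mean

# Average completion latency (clat, usec) reported by fio for each disk
fio_avg_clat_us = {
    "Disk 1": 428.97,
    "Disk 2": 420.75,
    "Disk 3": 429.67,
    "Disk 4": 422.20,
}

baseline = mean(fio_avg_clat_us.values())
spread = max(fio_avg_clat_us.values()) - min(fio_avg_clat_us.values())
relative_spread = spread / baseline

print(f"mean clat: {baseline:.2f} us")
print(f"max - min: {spread:.2f} us ({relative_spread:.1%} of the mean)")
```

The disks differ by roughly 9µs, about 2% of the mean, which is small next to the roughly 170µs gap between the 4 KiB and 32 KiB overall latencies.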

The results for the different block sizes suggest that the 4 KiB block size yields the lowest latency for random access to the State column. Since these reads do not benefit from data locality, larger blocks bring no advantage here, and smaller blocks make RocksDB faster for random unique reads. The reason is that pages on the SSD are also 4 KiB, which matches this RocksDB block size.
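
The size of the effect can be expressed as the slowdown of each block size relative to 4 KiB. A small Python sketch using the overall avg observed_latency values reported above; note that each block size was measured on a different disk, so small differences should be read with that in mind:

```python
# Overall avg observed_latency (microseconds) per block size (KiB), from the results above
observed_latency_us = {
    4: 516.923,
    8: 540.224,
    16: 612.676,
    32: 689.518,
}

baseline_us = observed_latency_us[4]

# Fractional slowdown relative to the 4 KiB configuration
slowdown = {
    block_kib: latency / baseline_us - 1.0
    for block_kib, latency in observed_latency_us.items()
}

for block_kib, extra in slowdown.items():
    latency = observed_latency_us[block_kib]
    print(f"{block_kib:>2} KiB: {latency:.3f} us ({extra:+.1%} vs 4 KiB)")
```

Relative to 4 KiB, the 8 KiB, 16 KiB, and 32 KiB configurations come out about 4.5%, 18.5%, and 33.4% slower, respectively.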

Longarithm (Maintainer) · 2023-07-28T15:47:41Z

So we should switch to 4 KiB blocks? I'm curious how this will perform on the upcoming Sweat benchmark.
