
Optimize ReadCacheLookup, 6.7x speed-up #517

Open · arvidn wants to merge 2 commits into main from optimize-compression
Conversation

@arvidn (Contributor) commented Dec 20, 2024

This is a major hot-spot when serializing with back references. This patch speeds it up by:

  • pre-allocating hash maps and hash sets
  • using IdentityHash to avoid hashing the tree-hashes again (see the sketch after this list)
  • exiting early if the structure we're trying to deduplicate is too small (i.e. trading off compression in favor of speed)
  • using BitVec instead of Vec<u8> to store paths (which only contain 0 and 1 anyway)
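
A minimal sketch of the identity-hasher idea (assuming 32-byte tree hashes as keys; the names and layout here are illustrative, not necessarily the exact code in this PR):

```rust
use std::collections::HashSet;
use std::hash::{BuildHasherDefault, Hash, Hasher};

/// Identity-style hasher: the keys are already uniformly distributed
/// tree hashes (SHA-256 outputs), so the "hash" is simply their first
/// eight bytes, with no extra mixing.
#[derive(Default)]
struct IdentityHash(u64);

impl Hasher for IdentityHash {
    fn finish(&self) -> u64 {
        self.0
    }

    fn write(&mut self, bytes: &[u8]) {
        self.0 = u64::from_le_bytes(bytes[..8].try_into().expect("expected a 32-byte hash"));
    }

    fn write_u64(&mut self, _i: u64) {
        panic!("This hasher only takes bytes");
    }
}

/// Key newtype whose Hash impl hands the whole 32-byte hash to the
/// hasher in a single `write` call.
#[derive(PartialEq, Eq, Clone)]
struct TreeHash([u8; 32]);

impl Hash for TreeHash {
    fn hash<H: Hasher>(&self, state: &mut H) {
        state.write(&self.0);
    }
}

/// Pre-allocated set keyed by tree hashes, bypassing SipHash entirely.
type TreeHashSet = HashSet<TreeHash, BuildHasherDefault<IdentityHash>>;

fn preallocated_set(capacity: usize) -> TreeHashSet {
    TreeHashSet::with_capacity_and_hasher(capacity, Default::default())
}
```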

Compared to main, the relevant benchmarks indicate a time reduction of between 85% and 91% when serializing CLVM with back references:

$ cargo bench node_to_bytes_backrefs -- --baseline main
     Running benches/serialize.rs (target/release/deps/serialize-21b0d813d80dbf8a)
Benchmarking serialize/node_to_bytes_backrefs 0: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 41.6s, or reduce sample count to 10.
serialize/node_to_bytes_backrefs 0
                        time:   [417.23 ms 420.23 ms 423.98 ms]
                        change: [-85.682% -85.544% -85.390%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  4 (4.00%) high mild
  7 (7.00%) high severe
Benchmarking serialize/node_to_bytes_backrefs 1: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 48.2s, or reduce sample count to 10.
serialize/node_to_bytes_backrefs 1
                        time:   [478.85 ms 481.87 ms 485.32 ms]
                        change: [-91.144% -91.077% -91.007%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 19 outliers among 100 measurements (19.00%)
  8 (8.00%) high mild
  11 (11.00%) high severe

One concern is that not mixing any random state into the hasher opens it up to hash-collision attacks. I'm not sure whether it's worth addressing, or what the best way to do so would be. Maybe add a random value to the IdentityHash object and mix it into the hash value, as sketched below.
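
One way that suggestion could look (a hypothetical sketch, not code from this PR): give the build-hasher a per-instance random salt and mix it into every hash value.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical salted variant of the identity hasher: the 8-byte
/// prefix of the tree hash is XORed with a random salt, so bucket
/// placement is no longer predictable to an attacker.
struct SaltedIdentityHash {
    salt: u64,
    hash: u64,
}

impl Hasher for SaltedIdentityHash {
    fn finish(&self) -> u64 {
        self.hash ^ self.salt
    }

    fn write(&mut self, bytes: &[u8]) {
        self.hash = u64::from_le_bytes(bytes[..8].try_into().expect("expected a 32-byte hash"));
    }
}

/// Builder that draws one random salt and hands it to every hasher it
/// creates (e.g. one per lookup structure).
struct SaltedBuildHasher {
    salt: u64,
}

impl SaltedBuildHasher {
    fn new() -> Self {
        // derive a process-random salt from std's randomly keyed
        // SipHash, avoiding an extra dependency
        let salt = RandomState::new().build_hasher().finish();
        Self { salt }
    }
}

impl BuildHasher for SaltedBuildHasher {
    type Hasher = SaltedIdentityHash;

    fn build_hasher(&self) -> SaltedIdentityHash {
        SaltedIdentityHash { salt: self.salt, hash: 0 }
    }
}
```

A map or set would then be constructed with, e.g., `HashMap::with_capacity_and_hasher(capacity, SaltedBuildHasher::new())`.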

@arvidn force-pushed the optimize-compression branch from a464e4d to 8671a04 on December 20, 2024 10:36
@arvidn requested a review from richardkiss on December 20, 2024 10:39

coveralls-official bot commented Dec 20, 2024

Pull Request Test Coverage Report for Build 12429902792

Details

  • 56 of 58 (96.55%) changed or added relevant lines in 1 file are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.004%) to 93.367%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/serde/read_cache_lookup.rs | 56 | 58 | 96.55% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/serde/read_cache_lookup.rs | 1 | 97.85% |

Totals:
  • Change from base Build 12418419499: -0.004%
  • Covered Lines: 5588
  • Relevant Lines: 5985

💛 - Coveralls

    }

    fn write_u64(&mut self, _i: u64) {
        panic!("This hasher only takes bytes");
    }
richardkiss (Contributor) commented:

I don't know if this is necessary. If you want a standalone hasher, I tried one a couple of years back; you could steal it from richardkiss/clvm_rs@50f6c94#diff-b696b12532e34a5d93b4bf96927a9049a2ad3d7c04bfde786611f6a560622e2f. I don't think it'd be much slower, if at all.

arvidn (Contributor, Author) commented:

this "hasher" depends on the input already having hash-properties. If anyone, for some reason, puts a u64 into it, that assumption likely no longer holds. I think it's preferable to panic in that case.

That said, the one risk I see with this is that there's no salt, so an attacker can grind hashes that causes worst-case performance of the hash set/map.

richardkiss (Contributor) commented:

Possible solution: don't always take the first eight bytes, but randomly select an offset between 0 and 24 to start taking the bytes from.

True, that only makes it 24 times as hard. But we could choose, say, five random offsets between 0 and 24 and XOR them together. Then it's 24**5 times as hard (see the sketch below).
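
A rough sketch of that offset idea (hypothetical, assuming 32-byte tree hashes delivered to the hasher in a single `write` call):

```rust
use std::hash::Hasher;

/// Hypothetical hasher for the offset-XOR idea: instead of always
/// taking the first eight bytes of the 32-byte tree hash, XOR together
/// the 8-byte windows starting at a few randomly chosen offsets in
/// 0..=24 (picked once, e.g. when the lookup structure is created).
struct OffsetXorHash {
    offsets: [usize; 5],
    hash: u64,
}

impl OffsetXorHash {
    fn new(offsets: [usize; 5]) -> Self {
        // each offset must satisfy offset + 8 <= 32
        Self { offsets, hash: 0 }
    }
}

impl Hasher for OffsetXorHash {
    fn finish(&self) -> u64 {
        self.hash
    }

    fn write(&mut self, bytes: &[u8]) {
        let mut h = 0u64;
        for &off in &self.offsets {
            h ^= u64::from_le_bytes(bytes[off..off + 8].try_into().expect("expected a 32-byte hash"));
        }
        self.hash = h;
    }
}
```

The offsets would have to be drawn once (e.g. when the ReadCacheLookup is created) and reused by every hasher the builder produces, otherwise equal keys would stop hashing equally.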

richardkiss (Contributor) commented:

One attack I see would be to create a legit spend that has so many tree-hash collisions that the lookup gets O(n^2) performance. If there's only a 1/24 chance of that happening, it won't be long before the spend gets included in a block by someone who happens to choose a failing offset, and then the spend is gone forever.

That's really the only attack I can think of.

@richardkiss (Contributor) left a review:

Looks good. I'd be curious to know if the u8 -> bool change does anything. But it seems at worst harmless.

@arvidn (Contributor, Author) commented Dec 28, 2024

There are still some worst-case scenarios where this isn't fast enough to serialize a full block. I'm working on a more ambitious PR to speed this up further.

@arvidn (Contributor, Author) commented Dec 29, 2024

A material portion of the speed-up in this patch comes from not using a salt in the hasher. That is a bit surprising, but I suppose we initialize new HashSets and HashMaps quite often. Adding back the salt makes the benchmarks take 3x the time.
