Use divide and conquer in to_radix_digits #316

HKalbasi · 2024-12-16T19:09:55Z

This implements the algorithm mentioned in #315.

Benchmark:

// prev
running 4 tests
test to_str_radix_10      ... bench:       4,242.41 ns/iter (+/- 712.43)
test to_str_radix_10_2    ... bench:      82,360.73 ns/iter (+/- 13,304.91)
test to_str_radix_10_3    ... bench:   3,929,829.90 ns/iter (+/- 514,647.85)
test to_str_radix_10_4    ... bench: 243,146,081.10 ns/iter (+/- 44,213,689.70)
// now
running 4 tests
test to_str_radix_10      ... bench:       4,261.38 ns/iter (+/- 522.98)
test to_str_radix_10_2    ... bench:      83,358.54 ns/iter (+/- 11,170.02)
test to_str_radix_10_3    ... bench:   2,623,301.20 ns/iter (+/- 279,505.89)
test to_str_radix_10_4    ... bench: 195,073,240.50 ns/iter (+/- 17,025,687.08)

Currently both versions grow as O(n^2). To make the conversion asymptotically faster, we need faster multiplication and division algorithms.
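For readers unfamiliar with the technique, here is a minimal sketch of divide and conquer radix conversion. It is not the code from this PR: it works on u128 instead of BigUint so that it stays self-contained, and the names to_radix_digits_le_simple, to_radix_digits_le_dc and split are made up for illustration. The idea is to precompute radix^(2^i) by repeated squaring, split the number into a high and a low half with one division by the largest power, and recurse on both halves.

```rust
/// Simple method: one division by the radix per output digit.
fn to_radix_digits_le_simple(mut n: u128, radix: u32) -> Vec<u8> {
    let radix = u128::from(radix);
    let mut out = Vec::new();
    loop {
        out.push((n % radix) as u8);
        n /= radix;
        if n == 0 {
            break;
        }
    }
    out
}

/// Recursive step. `powers[i]` holds radix^(2^i), and the caller guarantees
/// n < radix^(2^powers.len()). Pushes exactly 2^powers.len() little-endian
/// digits, so the low half is implicitly zero padded.
fn split(n: u128, powers: &[u128], out: &mut Vec<u8>) {
    match powers.split_last() {
        // No powers left: n < radix, so it is a single digit.
        None => out.push(n as u8),
        Some((&p, rest)) => {
            let (hi, lo) = (n / p, n % p);
            split(lo, rest, out); // low half first (little-endian)
            split(hi, rest, out);
        }
    }
}

/// Divide and conquer conversion to little-endian digits (hypothetical name).
fn to_radix_digits_le_dc(n: u128, radix: u32) -> Vec<u8> {
    assert!((2..=256).contains(&radix));
    // Build radix, radix^2, radix^4, ... until the next square exceeds n.
    let mut powers = Vec::new();
    let mut bound = u128::from(radix);
    while bound <= n {
        powers.push(bound);
        match bound.checked_mul(bound) {
            Some(next) => bound = next,
            None => break, // the next square exceeds u128::MAX >= n
        }
    }
    let mut out = Vec::new();
    split(n, &powers, &mut out);
    // Strip the zero padding at the most significant end.
    while out.len() > 1 && out.last() == Some(&0) {
        out.pop();
    }
    out
}
```

With schoolbook multiplication and division, the recursion costs T(n) = 2T(n/2) + O(n^2), which is still O(n^2) overall. That is consistent with the benchmark above showing a constant-factor gain rather than a change in growth rate.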

cuviper · 2024-12-16T19:24:56Z

src/biguint/convert.rs

@@ -701,34 +701,48 @@ pub(super) fn to_radix_digits_le(u: &BigUint, radix: u32) -> Vec<u8> {
    // The threshold for this was chosen by anecdotal performance measurements to
    // approximate where this starts to make a noticeable difference.
    if digits.data.len() >= 64 {


Did you re-evaluate this threshold at all? Notably, it's different than the one you used in to_radix_digits_le_divide_and_conquer. Maybe that does make sense since the inner part doesn't have to pay for creating big_bases, but I'm not sure.

HKalbasi · 2024-12-16

These are new results relevant for the threshold:

simple:
test 1009 bit ... bench:       4,169.26 ns/iter (+/- 470.97)
test 2009 bit ... bench:      14,735.97 ns/iter (+/- 1,819.63)
test 3009 bit ... bench:      32,522.20 ns/iter (+/- 2,949.82)
test 4009 bit ... bench:      56,441.64 ns/iter (+/- 6,354.65)
divide and conquer:
test 1009 bit ... bench:       5,955.14 ns/iter (+/- 859.07)
test 2009 bit ... bench:      12,731.82 ns/iter (+/- 1,780.59)
test 3009 bit ... bench:      18,701.03 ns/iter (+/- 2,284.40)
test 4009 bit ... bench:      27,605.41 ns/iter (+/- 5,229.87)

So 2000/64 ~ 32 probably makes sense as the new threshold?
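As a quick check of that arithmetic, the sketch below converts the benchmarked bit sizes into 64-bit digit counts and finds the first measured size where divide and conquer is faster. It assumes 64-bit digits, as the 2000/64 estimate does. The names RESULTS, limbs and crossover_limbs are made up for this example, and the timings are copied from the results above.

```rust
/// (bits, simple ns/iter, divide and conquer ns/iter), from the results above.
const RESULTS: [(u32, f64, f64); 4] = [
    (1009, 4169.26, 5955.14),
    (2009, 14735.97, 12731.82),
    (3009, 32522.20, 18701.03),
    (4009, 56441.64, 27605.41),
];

/// Number of 64-bit digits needed to hold `bits` bits (assumes u64 digits).
fn limbs(bits: u32) -> u32 {
    (bits + 63) / 64
}

/// Digit count of the first measured size at which divide and conquer wins.
fn crossover_limbs() -> Option<u32> {
    RESULTS
        .iter()
        .find(|&&(_, simple, dc)| dc < simple)
        .map(|&(bits, _, _)| limbs(bits))
}
```

With only four sample points, this only brackets the crossover between 16 and 32 digits. It does not pin down the best threshold.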

"since the inner part doesn't have to pay for creating big_bases, but I'm not sure"

If I understand correctly, the main difference for small numbers is that the recursive algorithm loses const propagation for 10. If that weren't the case, I'd expect a threshold near 8.
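To illustrate the const propagation point, here is a small sketch that is not taken from num-bigint. When the radix is a compile-time constant, optimizing compilers can usually replace the division with a multiply and shift sequence. When the radix arrives as a runtime argument to a function that is not inlined, a general division is typically needed. The function names are hypothetical, and `#[inline(never)]` is used only to keep the second function from being specialized in this toy example.

```rust
/// Radix fixed at compile time: `/ 10` and `% 10` can be strength-reduced.
fn digits_base10(mut n: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        out.push((n % 10) as u8);
        n /= 10;
        if n == 0 {
            break;
        }
    }
    out
}

/// Radix known only at run time: the divisor is an ordinary variable.
/// Expects 2 <= radix <= 256 so that every digit fits in a u8.
#[inline(never)]
fn digits_radix(mut n: u64, radix: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        out.push((n % radix) as u8);
        n /= radix;
        if n == 0 {
            break;
        }
    }
    out
}
```

Both functions return the same little-endian digits for radix 10. Any speed difference comes from the generated code and would need to be measured.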

cuviper · 2024-12-16

What exactly did you change for your new results?

HKalbasi · 2024-12-16

In the benchmarks? I changed them to this:

#[bench]
fn to_str_radix_10(b: &mut Bencher) {
    to_str_radix_bench(b, 10, 1009);
}

#[bench]
fn to_str_radix_10_2(b: &mut Bencher) {
    to_str_radix_bench(b, 10, 2009);
}

#[bench]
fn to_str_radix_10_3(b: &mut Bencher) {
    to_str_radix_bench(b, 10, 3009);
}

#[bench]
fn to_str_radix_10_4(b: &mut Bencher) {
    to_str_radix_bench(b, 10, 4009);
}

And I changed the condition "if digits.data.len() >= 64 {" to "if digits.data.len() >= 1 {" for one run and to "if digits.data.len() >= 1000 {" for the other.

HKalbasi · 2024-12-18T13:03:49Z

I would like to implement Burnikel-Ziegler fast division to make to_radix even faster. Would you rather have it in this PR, or merge this one alone?

Use divide and conquer in to_radix_digits

0dd9969

cuviper reviewed Dec 16, 2024


Update the threshold in to_radix_digits

d6313e2
