Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce stack usage by boxing File in Dist, CachePolicy and large futures #1004

Merged
merged 4 commits into from
Jan 19, 2024

Conversation

konstin
Copy link
Member

@konstin konstin commented Jan 19, 2024

This is #947 again but this time merging into main instead of downstack, sorry for the noise.


Windows has a default stack size of 1MB, which makes puffin often fail with stack overflows. The PR reduces stack size by three changes:

  • Boxing File in Dist, reducing the size from 496 to 240.
  • Boxing the largest futures.
  • Boxing CachePolicy

Method

Debugging happened on linux using #941 to limit the stack size to 1MB. Used ran the command below.

RUSTFLAGS=-Zprint-type-sizes cargo +nightly build -p puffin-cli -j 1 > type-sizes.txt && top-type-sizes -w -s -h 10 < type-sizes.txt > sizes.txt

The main drawback is top-type-sizes not saying what the __awaitee is, so it requires manually looking up with a future with matching size.

When the brotli features on reqwest is active, a lot of brotli types show up. Toggling this feature however seems to have no effect. I assume they are false positives since the brotli crate has elaborate control about allocation. The sizes are therefore shown with the feature off.

Results

The largest future goes from 12208B to 6416B, the largest type (PrioritizedDistribution, see also #948) from 17448B to 9264B. Full diff: https://gist.github.com/konstin/62635c0d12110a616a1b2bfcde21304f

For the second commit, i iteratively boxed the largest file until the tests passed, then with an 800KB stack limit looked through the backtrace of a failing test and added some more boxing.

Quick benchmarking showed no difference:

$ hyperfine --warmup 2 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/puffin-dev resolve meine_stadt_transparent" 
Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent
  Time (mean ± σ):      49.2 ms ±   3.0 ms    [User: 39.8 ms, System: 24.0 ms]
  Range (min … max):    46.6 ms …  63.0 ms    55 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: target/profiling/puffin-dev resolve meine_stadt_transparent
  Time (mean ± σ):      47.4 ms ±   3.2 ms    [User: 41.3 ms, System: 20.6 ms]
  Range (min … max):    44.6 ms …  60.5 ms    62 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  target/profiling/puffin-dev resolve meine_stadt_transparent ran
    1.04 ± 0.09 times faster than target/profiling/main-dev resolve meine_stadt_transparent

`Dist` was standing out when profiling stack sizes with top-type-sizes. Here, we trade an allocation per `Dist` for a more reasonable stack size.
Looking through the stack trace of `allowed_transitive_url_dependency` with a 800KB stack, i found those two to be the major offenders. These changes make `allowed_transitive_url_dependency` pass with a 800KB stack.
@konstin konstin added the windows Specific to the Windows platform label Jan 19, 2024
@konstin konstin enabled auto-merge (squash) January 19, 2024 09:37
@konstin konstin merged commit 47fc90d into main Jan 19, 2024
3 checks passed
@konstin konstin deleted the konsti/reduce-stack-usage branch January 19, 2024 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
windows Specific to the Windows platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant