
Make benchmarking 25% faster by parallelizing benchmarking venv setup #375

Closed
wants to merge 4 commits

Conversation

@Fidget-Spinner (Member) commented Dec 16, 2024

Fixes #374.

On my computer it takes roughly 90 minutes for a single pyperformance run.

Setting up (compiling and venv installation) takes 45 minutes. We can cut that down.

Before (no compilation, just installation):

real    31m40.161s
user    1m23.838s
sys     0m9.785s

After:

real    6m45.913s
user    3m37.286s
sys     0m22.162s

Roughly 5x faster setup (excluding compilation), saving about 25 minutes and thus roughly 25% of total benchmarking time.

Edit: this speedup only applies to older pip (22 and below).
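The idea of the patch, as described above, is to run the per-benchmark venv setups concurrently instead of one after another. A minimal sketch of that fan-out, with illustrative function names that are not pyperformance's actual API:

```python
import concurrent.futures
import subprocess
import sys

def setup_one_venv(path, requirements):
    # Hypothetical per-benchmark setup: create an isolated venv at `path`
    # and install that benchmark's requirements into it.
    subprocess.run([sys.executable, "-m", "venv", path], check=True)
    subprocess.run(
        [f"{path}/bin/python", "-m", "pip", "install", *requirements],
        check=True,
    )

def setup_all(setup, venvs, max_workers=8):
    # Fan the per-benchmark setups out over a thread pool. This is only
    # clearly safe when each worker writes to its own venv -- no state is
    # shared between workers. `venvs` maps venv path -> requirement list.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(setup, path, reqs) for path, reqs in venvs.items()]
        for fut in concurrent.futures.as_completed(futures):
            fut.result()  # surface any setup failure immediately
```

The separation of `setup_all` from the actual venv-creating function is just to keep the sketch testable; the thread-safety concerns raised later in this thread apply when workers share a venv, not in this fully-isolated case.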

@mdboom (Contributor) commented Dec 16, 2024

This is cool. Thanks for taking this on. Linking to a related bug.

There are a few things I'm puzzled by. On our benchmark machines, and locally for me, creating all of the virtual environments takes ~3 min, not 30, which is why I decided to work on benchmark running time first. I wonder what the difference is -- I don't think your machine would be that much slower. I seem to be getting pip 24.2 -- maybe yours is different? If we can get to the bottom of that, maybe we can get the 10x improvement without introducing race conditions (see below). My timings are with a hot package cache, so it's not doing a lot of downloading; but if downloading were the bottleneck, I wouldn't expect this patch to make things faster (which it does for both of us, though by a smaller margin for me).

Are we sure that pip install-ing into the same venv is thread safe? This issue seems to suggest it isn't. We may just be getting lucky here. The current combination of pyperformance/python_macrobenchmarks has only a single benchmark (pytorch-alexnet-inference) that doesn't fit into the common venv. If we had more that didn't, would we run into more problems? It's clear that with this change one could end up with non-deterministic sets of venvs (since there would be a race condition over which benchmark adds its dependencies to the common environment first), and since a venv's contents can affect Python startup time, we would have unintentionally introduced (more) non-determinism into the benchmarks.

@Fidget-Spinner (Member, Author) commented Dec 16, 2024

Are we sure that pip install-ing into the same venv is thread safe?

I don't think it's installing into the same venv though. I was under the assumption that it's installing into separate venvs.

On our benchmark machines and locally for me, creating all of the virtual environments takes ~3min not 30

Oh no. Can you please give the version of Python you're using and your OS? For context, a lot of things don't fit into the common venv for me.

Edit: I'm also using pip 22.3

@Fidget-Spinner (Member, Author) commented Dec 16, 2024

Ah, found it. It's the pip version! pip 22 is significantly slower than pip >= 23.

New pip brings down benchmark setup to just 1min30s on my computer. Thanks new pip and pypa team :)!
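Since the slowdown turned out to hinge on the pip major version, a quick programmatic check (a small sketch; the `pip_is_modern` helper and the threshold default are illustrative, with 23 taken from this thread's finding) could look like:

```python
from importlib import metadata

def pip_major_version():
    # Read the installed pip's version from its package metadata,
    # e.g. "24.2" -> 24.
    return int(metadata.version("pip").split(".")[0])

def pip_is_modern(minimum=23):
    # True when pip's major version is at least `minimum`. Per this thread,
    # pip >= 23 installs dramatically faster than pip 22 and below.
    return pip_major_version() >= minimum
```

A setup script could use this to warn users on old pip (or simply run `python -m pip install --upgrade pip` in each venv first).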

@mdboom (Contributor) commented Dec 16, 2024

I don't think it's installing into the same venv though. I was under the assumption that it's installing into separate venvs.

For each benchmark, it tries to install in the "common" venv, and if that fails (usually due to a version conflict) it creates a new benchmark-specific venv.

When I have it build for all of pyperformance + python_macrobenchmarks, I get the following under the venv directory (meaning it created 2 venvs, 1 common and 1 just for pytorch_alexnet_inference):

cpython3.14-f90d3bc7a4bc-compat-cb6a104a6688
cpython3.14-f90d3bc7a4bc-compat-cb6a104a6688-bm-pytorch_alexnet_inference

How many do you get?
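The try-common-then-fall-back behaviour described above can be sketched as follows (the function names and the injected callables are illustrative, not pyperformance's real API):

```python
def pick_venv(benchmark, requirements, try_install_common, make_dedicated):
    # try_install_common(requirements) should return True when pip manages to
    # fit the benchmark's requirements into the shared "common" venv (it
    # returns False on, e.g., a version conflict). On failure we fall back to
    # creating a dedicated per-benchmark venv via make_dedicated(benchmark).
    if try_install_common(requirements):
        return "common"
    return make_dedicated(benchmark)
```

This is where the race condition mentioned earlier lives: under parallel setup, which benchmark wins the race to install into the common venv first can change which later installs conflict and fall back.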

Oh no. Can you please give the version of Python you're using and your OS? For context, a lot of things don't fit into the common venv for me.

I'm using Python 3.11 as the "driver" of pyperformance, but benchmarking with CPython main. I see the same timings across all of the OSes -- Windows takes about 6 minutes rather than 3, but that's not surprising for something I/O-bound. These are the OS details:

linux: Intel® Xeon® W-2255 CPU @ 3.70GHz, running Ubuntu 20.04 LTS, gcc 9.4.0
linux2: 12th Gen Intel® Core™ i9-12900 @ 2.40 GHz, running Ubuntu 22.04 LTS, gcc 11.3.0
linux-aarch64: ARM Neoverse N1, running Ubuntu 22.04 LTS, gcc 11.4.0
macos: M1 arm64 Mac® Mini, running macOS 13.2.1, clang 1400.0.29.202
windows: 12th Gen Intel® Core™ i9-12900 @ 2.40 GHz, running Windows 11 Pro (21H2, 22000.1696), MSVC v143

@mdboom (Contributor) commented Dec 16, 2024

New pip brings down benchmark setup to just 1min30s on my computer. Thanks new pip and pypa team :)!

Oh, that's cool. Glad that helps. I think it's safest to just stick with that for now. Your fix here may be useful in the future if we need it and/or pip guarantees thread-safety. uv apparently already does.

@Fidget-Spinner (Member, Author) commented

Ok. Closing this; I will re-open it if we switch to uv. But uv is probably fast enough anyway that we wouldn't need this.

Linked issue: Parallelise venv creation during benchmark setup