
local evaluation: use nixpkgs ci.eval parallel implementation #440

Draft
wants to merge 1 commit into base: master

Conversation

@GaetanLepage (Contributor) commented on Dec 22, 2024

This PR aims to improve local evaluation.

Currently, local evaluation uses a (single-threaded) nix-env call per system.
This patch instead leverages the recent parallel ci.eval implementation in nixpkgs itself.

Current limitations:

  • Not respecting --allow the way we used to (maybe it is possible...)
  • Requires carefully tuning the --max-jobs, --cores and chunkSize parameters of ci.full.
    We need to expose new options for those and provide sensible (ideally automatic) default values.
  • The listing of "Impacted packages" on stdout is now limited to all "rebuilds" (i.e. added + changed packages). It used to be more detailed (removed, changed and added, I think). That information is not available in the output of the eval.compare derivation; if needed, we could change the upstream implementation.

TODO

  • Handle --allow user preferences
  • Fix tests which all got broken by this patch
  • (Automatically) set sensible default values for the eval.full parameters (--max-jobs, --cores and --arg chunkSize); see the sketch below
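
For reference, a minimal sketch of what driving that entry point could look like, assuming the `nix-build ci -A eval.full` interface with `--arg chunkSize` described in the nixpkgs ci README (the wrapper below is hypothetical, not the code in this PR):

```python
# Hypothetical wrapper (not this PR's actual code) showing how the nixpkgs
# ci entry point could be driven with the parameters discussed above.
# Assumes a nixpkgs checkout whose ci/ directory provides eval.full and
# accepts --arg chunkSize, as described in the ci README.
import subprocess
from pathlib import Path


def run_parallel_eval(nixpkgs: Path, max_jobs: int, cores: int, chunk_size: int) -> Path:
    """Evaluate all systems via ci's eval.full and return the result out-link."""
    out_link = nixpkgs / "result-eval"
    subprocess.run(
        [
            "nix-build", str(nixpkgs / "ci"), "-A", "eval.full",
            "--max-jobs", str(max_jobs),
            "--cores", str(cores),
            "--arg", "chunkSize", str(chunk_size),
            "--out-link", str(out_link),
        ],
        check=True,
    )
    return out_link
```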

cc @Mic92 @zowoq @khaneliman

@GaetanLepage (Contributor, Author) commented on Dec 22, 2024

For now, I initialize the parameters following the recommendations of the ci.eval README (see the sketch below):

  • --max-jobs -> n_systems
  • --cores -> n_cpu // n_systems (where n_cpu is the number of logical cores, i.e. threads, on the system)
  • chunkSize -> 10_000
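
A minimal sketch of that default choice (the names are illustrative, not the PR's actual code):

```python
# Sketch of the default parameter choice above; names are illustrative.
import os


def default_eval_params(n_systems: int, chunk_size: int = 10_000) -> dict:
    n_cpu = os.cpu_count() or 1  # logical cores (threads)
    return {
        "max_jobs": n_systems,                # one evaluation job per system
        "cores": max(1, n_cpu // n_systems),  # split threads across those jobs
        "chunk_size": chunk_size,             # attributes per evaluator chunk
    }


# Example: 4 systems on a 12-thread machine -> max_jobs=4, cores=3, chunk_size=10000
print(default_eval_params(n_systems=4))
```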

I conducted early and naive benchmarks on the following system:

  • 6 cores / 12 threads (Ryzen 5600X)
  • 64GB of RAM

I ran the evaluation for the four supported systems (x86_64-linux, aarch64-linux, x86_64-darwin, aarch64-darwin):

| method | duration (mm:ss) | max RAM usage | number of cores fully used |
| --- | --- | --- | --- |
| legacy nix-env | 2:25 | 53 GB | 4 (1 per system evaluated) |
| new ci.eval.full | 4:50 | 39 GB | 12 |

So unfortunately, this method appears to be slower than our current one... Maybe tuning the hyper-parameters can give better results, but as we are already maxing out the CPU, I don't expect large gains.
Nevertheless, the new method allows for finer control over RAM usage (not everyone has 64 GB of RAM).

Note: with chunkSize = 20000, the time went down to 4:25.

cc @infinisil, maybe you will have more insights.

@infinisil commented

You should be able to get the same speed as the legacy nix-env by setting the chunk size to something like 1000000 (there are about 100000 attributes right now). But to get the best speed, I don't have a better answer than trying a bunch of different numbers for the chunk size.

@Mic92 (Owner) commented on Dec 23, 2024

How evenly is the max memory usage spread across the different chunks?
After all, I consider this a parameter that is useful mainly for keeping memory usage within certain bounds.

@GaetanLepage (Contributor, Author) commented

> How evenly is the max memory usage spread across the different chunks?

I don't really know. The overall RAM usage fluctuates and peaks very briefly at ~39 GB (with chunkSize = 10k).
I haven't taken more precise measurements.

@Mic92 (Owner) commented on Dec 24, 2024

It feels like we need to solve this in Nix itself, i.e. track RAM usage there and restart the interpreter once a threshold is breached.
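
As a rough, purely illustrative sketch of that idea at the process level (Linux-only, and outside Nix itself, so everything here is hypothetical):

```python
# Hypothetical, Linux-only sketch: watch a child evaluator process and
# restart it once its resident memory crosses a threshold. This only
# illustrates the "restart after a threshold" idea; it is not Nix code.
import subprocess
import time


def rss_kib(pid: int) -> int:
    """Resident set size of a process in KiB, read from /proc."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    return 0


def run_with_memory_limit(cmd: list[str], limit_kib: int, max_restarts: int = 3) -> None:
    for _ in range(max_restarts):
        proc = subprocess.Popen(cmd)
        while proc.poll() is None:
            if rss_kib(proc.pid) > limit_kib:
                proc.terminate()  # evaluation would have to be resumable for this to be safe
                proc.wait()
                break
            time.sleep(1)
        else:
            return  # process finished within the limit
    raise RuntimeError("evaluator kept exceeding the memory limit")
```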

@GaetanLepage (Contributor, Author) commented

So, what should we do about this?
While I find this new implementation more satisfying, cleaner, and offering finer control over the evaluation process, one must admit that it is significantly less performant.

@Mic92 (Owner) commented on Jan 2, 2025

> So, what should we do about this? While I find this new implementation more satisfying, cleaner, and offering finer control over the evaluation process, one must admit that it is significantly less performant.

Maybe the new method could live on a separate branch for the time being. Some people might have machines with enough resources where it actually speeds things up. I am wondering if we can leverage libnix somehow to restart evaluation when it consumes too much RAM.
