-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance #274
base: dev
Are you sure you want to change the base?
Improve performance #274
Conversation
Those are very impressive numbers. Thank you for looking into this! I appreciate getting to see the unified vision of where everything is going. Obviously, with this much going on we are not going to be able to review it all in one PR. It's up to you how you want to divide things up into reviewable pieces. @konstin just added some really cool automation for running our benchmarks in #271, which apparently I forgot to click merge on. So if we get the benchmark merged first we can easily and automatically see the improvements from the other pieces. The non breaking changes do not seem all that controversial. So they may be easy to land. Of course, they may not pull their weight until the other (API-breaking) optimizations have come in. The breaking changes will need to be looked at very carefully. Certainly each as their own PR. An order of magnitude performance improvement is nothing to sneeze at, but performance is not our only priority. Our existing API is careful to be "misuse resistant" in that it is hard to get an incorrect answer because you were "holding it wrong". I look forward to reviewing this work, one piece at a time, over the next weeks/months. |
I wonder if any of these changes can be broken off into smaller, non-breaking PRs? |
b4ae053
to
e559617
Compare
That's an impressive speedup, and some deep changes! Long and ramble-y comment ahead, here are some unstructured thoughts about this: From spending a lot of time optimizing uv, i found that often making pubgrub do less work is the most effective strategy, for example by package prioritization (https://github.com/astral-sh/uv/blob/14507a1793f12bf884786be49d716ca8f8322340/crates/uv-resolver/src/pubgrub/priority.rs#L25) or astral-sh/uv#8157. If we're checking a large (exponential) combination of packages of versions, trying to get to a package selection that decides things that are either easy/clear or that reduce the remaining space a lot avoids doing work in pubgrub. How did you create By splitting out the non-breaking changes, we can bench them in uv. We need to decide which optimizations are in pubgrub and which are outside. For example, in uv we I'm not following along with the new What version type is cargo currently using, can we do an optimization for small versions? One thing a had pitched to @Eh2406 earlier:
|
This file was created from cargo, using https://github.com/x-hgg-x/pubgrub-bench, by dumping the output of
See Eh2406/pubgrub-crates-benchmark#6 (comment).
I will make separate PRs for each commit, this PR is only used to show the combined performance impact.
To use it, a dependency provider will need to store a list of all versions for a given package, which should be sorted by priority so that To allow for more than 63 versions of a package, virtual packages need to be defined to ensure package unicity and to allow more than 63 possible versions of a dependency:
Note that this complexifies version choosing, but we can avoid difficulties by sorting versions by priority in the dependency provider. |
0efe2d9
to
ec45c54
Compare
This is an impressive speedup. I don’t have the bandwidth personally to review this unfortunately so I’ll just trust the other pairs of eyes already on this. Few remarks, I agree with the others about the possibilities to split this between non-controversial changes and more controversial changes. But it seems you already agree with that too so all good. Another broad remark, I feel like some of the speedups there rely on specific representations for versions if I understand correctly what you did with the u64 bitset. I’d just want to point out the fact that it’s important to keep the overall API generic so that version sets details of implementation can stay user-specific and not embedded directly into pubgrub assumptions. Nothing else to add except thank you for all that work! and I’m excited to (sporadically) follow along. |
Dependency providers can still internally use whatever version format the want, but with this PR Pubgrub only accepts discrete versions for a dependency in the This means we need to list all versions of a dependency matching the requirement in the The |
I will rename |
064d143
to
f2c6859
Compare
CodSpeed Performance ReportMerging #274 will improve performances by 35.2%Comparing Summary
Benchmarks breakdown
|
8ff1870
to
193e5b0
Compare
Those are very exciting numbers! What piece do you want to work on next? The first of your commits is 5834083 |
The other non-breaking changes don't give any perf gain without the |
3e973d7
to
ebdc595
Compare
810b1ce
to
5cedffb
Compare
0b15612
to
4fdb814
Compare
I added a method in the dependency provider to register a conflict when backtracking for better package prioritization, and it shows good results for Before:
After:
Conflict map:
|
cc26e4b
to
9abac20
Compare
b0ddb5f
to
080a443
Compare
This PR is an experiment to improve performance of Pubgrub.
There are many API breaking changes which should be reviewed before this can be merged.
The following benchmarks are done using the new bench file
bench_backtracking.ron
:dev
branch (57971c4)Results for the
large_case
bench:dev
branch (57971c4)Results for the
sudoku
bench:dev
branch (57971c4)Implementation example for
cargo
: https://github.com/x-hgg-x/pubgrub-bench