Profile-Guided Optimization (PGO) benchmark report #2385
zamazan4ik started this conversation in Ideas
Hi!
As I have done many times before, I decided to test the Profile-Guided Optimization (PGO) technique on gitui, since the project (according to the README file) cares about its runtime speed. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo. Here are my benchmark results. This information can be interesting for anyone who wants to get more performance out of the application in their use cases.
Test environment
gitui version: master branch on commit 90a226927b226f33736f6c5cad150c895008a92a
Benchmark
For benchmark purposes, I use the approach proposed in the "Benchmarks" section: opening the Linux kernel repo. I use this repo on the master branch at commit 87d6aab2389e5ce0197d8257d5f8ee965a67c4cd. The benchmark is the following:
For PGO optimization I use the cargo-pgo tool. The Release binary is built with the `cargo build --release` command. The PGO training binary is built with `cargo pgo build`, and the PGO optimized binary with `cargo pgo optimize build`. `taskset -c 0` is used for all commands to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as much as I can guarantee), and multiple times (at least 3 times for each binary).
Results
I got the following results:
The results are consistent across runs. At least in this simple benchmark scenario, I see a measurable improvement. This may be useful to know if we aim for peak performance in the app.
Further steps
At the very least, gitui users can find this performance report and decide to enable PGO for their builds if they care about performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this work. Another way is to figure out the root cause of the performance differences between the PGO and non-PGO gitui versions and, probably, tweak the sources a bit more. However, this also requires some time to analyze the LLVM IR/assembly differences between them.
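For anyone who wants to try PGO on their own gitui build, the build steps from the Benchmark section can be sketched as a small shell script. This is only a sketch under a few assumptions: cargo-pgo is already installed, the script is started from a gitui checkout, and the helper names (`run`, `build_release`, `build_instrumented`, `build_optimized`) are made up for illustration. By default it only prints the commands; nothing is executed unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch of the PGO build steps described in the Benchmark section.
# Assumptions: cargo-pgo is installed and the current directory is a gitui checkout.
# All function names here are illustrative, not part of cargo-pgo or gitui.

# Print a command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Step 1: plain Release build, used as the baseline.
build_release() { run taskset -c 0 cargo build --release; }

# Step 2: PGO training (instrumented) build.
build_instrumented() { run taskset -c 0 cargo pgo build; }

# Step 3 is manual: run the instrumented gitui on the benchmark workload so
# that it writes profile data. The benchmark command itself is not shown here.

# Step 4: rebuild with the collected profiles.
build_optimized() { run taskset -c 0 cargo pgo optimize build; }

# Pick one step by name; with no argument, just list the build commands.
case "${1:-plan}" in
    release)    build_release ;;
    instrument) build_instrumented ;;
    optimize)   build_optimized ;;
    plan)       RUN=0; build_release; build_instrumented; build_optimized ;;
esac
```

Saved as, say, pgo.sh (a hypothetical file name), `RUN=1 sh pgo.sh instrument` would run only the training build, so the profile-gathering run can happen before `RUN=1 sh pgo.sh optimize`.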
Also, Post-Link Optimization (PLO) can be tested after PGO. It can be done by applying tools like LLVM BOLT. However, it's a much less mature optimization technique compared to PGO.
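I have not measured BOLT for gitui, so the following is only a sketch of how such a test could look with cargo-pgo's BOLT subcommands. It assumes that PGO profiles were already collected, that `llvm-bolt` and `merge-fdata` are available on PATH, and that the helper names (`run`, `bolt_instrument`, `bolt_optimize`) are illustrative. As above, commands are only printed unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch of a PGO + BOLT build with cargo-pgo (not measured in this report).
# Assumptions: PGO profiles already exist, llvm-bolt and merge-fdata are on PATH.
# Function names are illustrative only.

# Print a command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Step 1: build a PGO-optimized binary with BOLT instrumentation.
bolt_instrument() { run cargo pgo bolt build --with-pgo; }

# Step 2 is manual: run the BOLT-instrumented binary on the same workload
# to gather BOLT profiles.

# Step 3: build the final binary optimized with both PGO and BOLT.
bolt_optimize() { run cargo pgo bolt optimize --with-pgo; }

# Pick one step by name; with no argument, just list the build commands.
case "${1:-plan}" in
    instrument) bolt_instrument ;;
    optimize)   bolt_optimize ;;
    plan)       RUN=0; bolt_instrument; bolt_optimize ;;
esac
```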
Thank you.