Profile-Guided Optimization (PGO) benchmark report #2385

zamazan4ik · 2024-10-08T18:05:29Z

Hi!

As I have done many times before, I decided to test Profile-Guided Optimization (PGO) on gitui, since the project (according to the README file) cares about its runtime speed. For reference, results for other projects are available at https://github.com/zamazan4ik/awesome-pgo . Here are my benchmark results.

This information may be of interest to anyone who wants to get more performance out of the application in their use cases.

Test environment

Fedora 40
Linux kernel 6.10.12
AMD Ryzen 9 5900X
48 GiB RAM
SSD Samsung 980 Pro 2 TiB
Compiler: rustc 1.81.0
gitui version: master branch on commit 90a226927b226f33736f6c5cad150c895008a92a
Turbo Boost disabled

Benchmark

For the benchmark, I follow the approach proposed in the "Benchmarks" section: opening the Linux kernel repo. I use the master branch of that repo at commit 87d6aab2389e5ce0197d8257d5f8ee965a67c4cd. The benchmark is the following:

Open GitUI on the "Log" tab and wait for commit parsing to finish. Measure the time between application start (I open it on the "Log" tab) and the point where parsing finishes.

For PGO, I use the cargo-pgo tool. The Release binary is built with the cargo build --release command. The PGO training (instrumented) binary is built with cargo pgo build, and the PGO-optimized binary with cargo pgo optimize build.
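The three builds described above can be sketched as a short script. This is only an illustration: the `run` and `pgo_build_steps` helper names are hypothetical, and the two setup commands (installing cargo-pgo and the `llvm-tools-preview` component) are assumptions about a fresh environment rather than steps listed in this report. The script is a dry run by default and only prints the commands.

```shell
#!/usr/bin/env sh
# Dry-run wrapper (hypothetical helper): prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pgo_build_steps() {
    # Assumed one-time setup: cargo-pgo itself and the LLVM tools it relies on.
    run cargo install cargo-pgo
    run rustup component add llvm-tools-preview

    # Baseline Release binary.
    run cargo build --release

    # Instrumented (training) binary. After this step, exercise it on the
    # benchmark workload so that profile data is collected.
    run cargo pgo build

    # Rebuild using the collected profiles.
    run cargo pgo optimize build
}

pgo_build_steps
```

Setting `DRY_RUN=0` executes the commands for real. The instrumented binary still has to be run by hand between the last two steps; as far as I know, cargo-pgo prints its location at the end of the instrumented build.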

taskset -c 0 is used for all commands to reduce the OS scheduler's influence on the results. All measurements are done on the same machine, with the same background "noise" (as far as I can guarantee), and repeated multiple times (at least 3 times per binary).
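A rough sketch of how the pinning and averaging could be scripted is shown below. The helper names `pinned` and `mean` are hypothetical, the binary path in the usage comment is a placeholder, and the timings passed to `mean` have to be your own readings: this report does not say how the elapsed time was read off.

```shell
# Pin a command to CPU core 0, matching the taskset -c 0 setting used for every run.
pinned() {
    taskset -c 0 "$@"
}

# Average a list of timings (in seconds) passed as arguments, to one decimal place.
mean() {
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.1f\n", sum / NR }'
}

# Usage, from inside the Linux kernel checkout (path is a placeholder):
#   pinned /path/to/gitui/target/release/gitui
#   mean <t1> <t2> <t3>
```

Keeping the pinning in one helper makes it harder to forget for one of the binaries, which would otherwise skew the comparison.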

Results

I got the following results:

Release: 15.7s on average
PGO optimized: 13.3s on average

The results are consistent across runs. At least in this simple benchmark scenario, I see a measurable improvement. This may be useful to know if we aim for peak performance in the app.
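To put the two averages in relative terms, the arithmetic can be done with a small helper. The function name `pgo_gain` is hypothetical. The inputs are the two averages reported above.

```shell
# Print absolute and relative gain of an optimized time over a baseline time.
# Arguments: baseline seconds, optimized seconds.
pgo_gain() {
    awk -v r="$1" -v p="$2" 'BEGIN {
        printf "time saved: %.1fs (%.1f%%), speedup: %.2fx\n", r - p, (r - p) / r * 100, r / p
    }'
}

pgo_gain 15.7 13.3
# prints: time saved: 2.4s (15.3%), speedup: 1.18x
```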

Further steps

At the very least, gitui users can find this performance report and decide to enable PGO for their builds if they care about performance in their workloads. Maybe a small note somewhere in the documentation (the README file?) will be enough to raise awareness about this work. Another option is to figure out the root cause of the performance difference between the PGO and non-PGO gitui builds and, possibly, tweak the sources a bit more. However, this requires some time to analyze the LLVM IR/assembly differences between them.

Also, Post-Link Optimization (PLO) can be tested after PGO, using tools like LLVM BOLT. However, it is a much less mature optimization technique than PGO.
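One possible way to try BOLT on top of PGO is through cargo-pgo's own BOLT subcommands. This was not tested in this report, so treat it as an untested sketch: it assumes `llvm-bolt` and `merge-fdata` are installed and on `PATH`, and the `run` and `bolt_after_pgo_steps` helper names are hypothetical. As above, the script is a dry run by default.

```shell
#!/usr/bin/env sh
# Dry-run wrapper (hypothetical helper): prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bolt_after_pgo_steps() {
    # PGO-instrumented build; exercise this binary on the benchmark workload first.
    run cargo pgo build

    # BOLT-instrumented build that already uses the PGO profiles.
    # Exercise the resulting instrumented binary on the same workload.
    run cargo pgo bolt build --with-pgo

    # Final binary: PGO-optimized, then post-link optimized by BOLT.
    run cargo pgo bolt optimize --with-pgo
}

bolt_after_pgo_steps
```

The final binary would then need the same pinned, repeated measurement as the Release and PGO builds to see whether BOLT adds anything on this workload.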

Thank you.
