Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimise wit_infer_by_expr util function #772

Open
hero78119 opened this issue Dec 18, 2024 · 0 comments
Open

optimise wit_infer_by_expr util function #772

hero78119 opened this issue Dec 18, 2024 · 0 comments

Comments

@hero78119
Copy link
Collaborator

hero78119 commented Dec 18, 2024

Context

expression witness inferring util wit_infer_by_expr both used in in opcode and table proof
occupied quite lot of time, e.g.

with fibonacchi e2e command cargo run --release --package ceno_zkvm --bin e2e -- --profiling=3 --max-steps=1048576 --platform=sp1 ceno_zkvm/examples/fibonacci.elf

and take "ADD" opcode as example

TRACE       ?? create_opcode_proof [ 574ms | 0.03% / 15.39% ] circuit_name: "ADD"
TRACE       ?  ?? wit_inference [ 273ms | 7.30% ] profiling_3: true
TRACE       ?  ?? SUMCHECK [ 186ms | 4.97% ] profiling_3: true
TRACE       ?  ?? witin::evals [ 12.9ms | 0.35% ] profiling_3: true
TRACE       ?  ?? pcs_open [ 102ms | 2.74% ] profiling_3: true

the wit_inference inferring occupied around 50%.

Preliminary review and the overhead come from creating much intermediate vector instead of reuse and mutating. Besides, the expression processing here degree is just 1, so we can exploit this structure and apply optimisation on it wih new util function wit_infer_by_expr_degree_1

Idea of optimisation

  • allocate just one mutable vector and do in-placement change during witness inferrence
  • leverage grid-stride-loop to traverse vector

The (simulated) gain of idea

In this commit
master...hero78119:ceno:feat/wit_infer_opt

  • we keep the whole traverse (but not updating mutable vector, as we need to figure out how to pass mutable vector during expression evaluation
  • the traverse change to grid-stride-loop style

and the benchmark result

fibonacci_max_steps_1048576/prove_fibonacci/fibonacci_max_steps_1048576
                        time:   [4.1000 s 4.1310 s 4.1591 s]
                        change: [-16.733% -15.242% -13.712%] (p = 0.00 < 0.05)
                        Performance has improved.

So this optimisation show great potiential and bring up to 15% e2e latency reduction

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant