Should we use lw/sw in push pop when we used ILP32, whether it's RV32 or RV64 #382

Open
Liaoshihua opened this issue May 19, 2023 · 3 comments

Comments

@Liaoshihua

While implementing ILP32 on RV64 (see the patches), we found that the RV64 calling convention increases stack usage under the ILP32 ABI. So, should we use lw/sw for push/pop whenever ILP32 is used, regardless of whether the target is RV32 or RV64?

@guoren83 @kito-cheng @palmer-dabbelt @asb @jrtc27

Details :

On a 64-bit ISA, the callee cannot determine how much of a
register is actually in use, so it saves the full width of the
ISA register, i.e., XLEN bits. We found the same rule in x86-x32,
mips-n32, and aarch64ilp32; in our case it comes from rv64lp64.

Here are two downsides of this:

  • The stack frame differs from that of 32ilp32, yet s64ilp32
    reuses the 32ilp32 software stack. Porting software to 64ilp32
    would therefore run into many additional compatibility problems.
  • It also increases stack usage, as in this disassembly:
    <setup_vm>:
    auipc a3,0xff3fb
    add a3,a3,1234 # c0000000
    li a5,-1
    lui a4,0xc0000
    addw sp,sp,-96
    srl a5,a5,0x20
    subw a4,a4,a3
    auipc a2,0x111a
    add a2,a2,1212 # c1d1f000
    sd s0,80(sp)  ----+
    sd s1,72(sp)      |
    sd s2,64(sp)      |
    sd s7,24(sp)      |
    sd s8,16(sp)      |
    sd s9,8(sp)       |-> All <= 32b widths, but occupy 64b
    sd ra,88(sp)      |   stack space.
    sd s3,56(sp)      |   Affect memory footprint & cache
    sd s4,48(sp)      |   performance.
    sd s5,40(sp)      |
    sd s6,32(sp)      |
    sd s10,0(sp)  ----+
    sll a1,a4,0x20
    subw a2,a2,a3
    and a4,a4,a5
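
To put a number on the stack cost, here is a minimal C sketch of the save-area arithmetic for the listing above. The helper name `save_area_bytes` is hypothetical, and the sketch assumes sp is kept 16-byte aligned, as in the standard RISC-V calling convention. It counts only the saved registers (ra and s0 through s10, twelve in total), not local variables.

```c
#include <stddef.h>

/* Assumption: sp stays 16-byte aligned (standard RISC-V calling convention). */
#define STACK_ALIGN 16u

/* Hypothetical helper: bytes of stack needed to save `nregs` registers
 * in slots of `slot_bytes` each, rounded up to the stack alignment. */
static size_t save_area_bytes(size_t nregs, size_t slot_bytes)
{
    size_t raw = nregs * slot_bytes;
    return (raw + STACK_ALIGN - 1) / STACK_ALIGN * STACK_ALIGN;
}
```

With twelve registers, `save_area_bytes(12, 8)` gives 96, matching the `addw sp,sp,-96` in the listing, while `save_area_bytes(12, 4)` gives 48, which is what the same prologue would need with sw instead of sd.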

So here is a proposal for the RISC-V 64ilp32 ABI:

  • Have the compiler avoid keeping variables wider than 32 bits in
    callee-saved registers, so that a callee only needs to save
    32 bits per register. (Open question: we need to measure the
    impact on 64b variables that are live across function calls.)
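
The reason the compiler would have to keep wider values out of callee-saved registers can be sketched in C. The function below is a hypothetical model, not code from the patches: it mimics an `sw` followed by an `lw` on RV64, where only the low 32 bits reach memory and `lw` sign-extends them back into the 64-bit register.

```c
#include <stdint.h>

/* Hypothetical model of `sw reg, off(sp)` followed by `lw reg, off(sp)`
 * on RV64: the store keeps the low 32 bits, the load sign-extends them. */
static uint64_t save_restore_32(uint64_t reg)
{
    uint32_t slot = (uint32_t)reg;       /* sw: only the low 32 bits are stored */
    int64_t value = (int64_t)slot;       /* value in 0 .. 2^32 - 1 */
    if (slot & UINT32_C(0x80000000))
        value -= INT64_C(1) << 32;       /* lw: replicate bit 31 into bits 63..32 */
    return (uint64_t)value;
}
```

A register holding a properly sign-extended 32-bit value, which is how RV64 keeps `int` in registers, survives the round trip unchanged. A value that really uses the upper half, such as `0x0000000100000002`, comes back as `2`, so it cannot be left in a register that a callee saves with sw/lw.
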
@aswaterman
Contributor

My sense is that this will be slightly preferable for most code, but there is risk that instruction-count increase in less-common cases will outweigh the benefits. Especially since this ABI decision would differ from x86 x32, I'd advise against doing it without quantitative justification. If you do SPEC CPU runs with both ABI designs, you'll see whether the perf boost from reduced D$ miss rate on some programs is outweighed by the instruction-count increase for 64b values that are live across calls.
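
As an illustration of a 64b value that is live across calls, consider the hypothetical C loop below (both function names are made up for this sketch, and `square` is assumed not to be inlined). The accumulator `acc` is 64 bits wide and must survive every call to `square`. Under the existing ABI a compiler can keep it in a callee-saved register; under the proposal it would presumably have to be spilled to the stack and reloaded around each call, which is where the extra instructions would come from.

```c
#include <stdint.h>

/* Hypothetical callee; assume the compiler does not inline it. */
uint64_t square(uint32_t x)
{
    return (uint64_t)x * x;
}

/* Sum of squares 1^2 + 2^2 + ... + n^2. */
uint64_t sum_squares(uint32_t n)
{
    uint64_t acc = 0;                /* 64-bit value, live across each call below */
    for (uint32_t i = 0; i < n; i++)
        acc += square(i + 1);
    return acc;
}
```

Whether loops of this shape are common enough to matter is exactly what the benchmark runs would show.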

@kito-cheng
Collaborator

That effectively means there are no real callee-saved registers. One concern of mine is that this is relatively easy for the compiler but hard to maintain and debug in hand-written assembly code. It also seems to rely on the assumption that the HW/MMU ignores the upper 32 bits of an address; otherwise ra and sp might run into trouble.

I also have a bigger question: the main advantage of ilp32/rv64 is that we get native 64-bit operations, which can make it faster than ilp32/rv32, yet this design seems likely to hurt the performance of 64-bit operations. That feels contradictory to me, so why not just use ilp32/rv32?

Of course, there is always the option of having both ABI flavors: ilp32/RV64 as the ordinary ABI and ilp32x/rv64 for this proposal. But until there is more data and evidence that such a new ABI variant is really worth maintaining, I would prefer sticking to the ordinary ilp32/RV64 ABI.

So let's get some benchmark data, such as SPEC CPU and EEMBC, since SPEC CPU scores might not be representative of the embedded world.

@guoren83

guoren83 commented May 22, 2023 via email

4 participants