Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for wasm exception handling to Emscripten target #131830

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

hoodmane
Copy link
Contributor

@hoodmane hoodmane commented Oct 17, 2024

This is a draft because we need some additional setting for the Emscripten target to select between the old exception handling and the new exception handling. I don't know how to add a setting like that, would appreciate advice from Rust folks. We could maybe choose to use the new exception handling if Ctarget-feature=+exception-handling is passed? I tried this but I get errors from llvm so I'm not doing it right.

cc @kleisauke @kripken @sbc100 @workingjubilee @alexcrichton

@rustbot
Copy link
Collaborator

rustbot commented Oct 17, 2024

r? @compiler-errors

rustbot has assigned @compiler-errors.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 17, 2024
@hoodmane hoodmane marked this pull request as draft October 17, 2024 12:05
@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-18 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
------
 > importing cache manifest from ghcr.io/rust-lang/rust-ci-cache:c32c805632780b5c1de330e3f44561b336c2efe163bc0990acb392390157a8e1d9f855d75914a239aa40c49d77f4a837247d05d2f8d46f554b98e1f46712a3e3:
------
##[endgroup]
Setting extra environment values for docker:  --env ENABLE_GCC_CODEGEN=1 --env GCC_EXEC_PREFIX=/usr/lib/gcc/
[CI_JOB_NAME=x86_64-gnu-llvm-18]
debug: `DISABLE_CI_RUSTC_IF_INCOMPATIBLE` configured.
---
sccache: Starting the server...
##[group]Configure the build
configure: processing command line
configure: 
configure: build.configure-args := ['--build=x86_64-unknown-linux-gnu', '--llvm-root=/usr/lib/llvm-18', '--enable-llvm-link-shared', '--set', 'rust.randomize-layout=true', '--set', 'rust.thin-lto-import-instr-limit=10', '--set', 'change-id=99999999', '--enable-verbose-configure', '--enable-sccache', '--disable-manage-submodules', '--enable-locked-deps', '--enable-cargo-native-static', '--set', 'rust.codegen-units-std=1', '--set', 'dist.compression-profile=balanced', '--dist-compression-formats=xz', '--set', 'rust.lld=false', '--disable-dist-src', '--release-channel=nightly', '--enable-debug-assertions', '--enable-overflow-checks', '--enable-llvm-assertions', '--set', 'rust.verify-llvm-ir', '--set', 'rust.codegen-backends=llvm,cranelift,gcc', '--set', 'llvm.static-libstdcpp', '--enable-new-symbol-mangling']
configure: target.x86_64-unknown-linux-gnu.llvm-config := /usr/lib/llvm-18/bin/llvm-config
configure: llvm.link-shared     := True
configure: rust.randomize-layout := True
configure: rust.thin-lto-import-instr-limit := 10
---
failures:

---- [codegen] tests/codegen/emcripten-catch-unwind.rs stdout ----

error: verification with 'FileCheck' failed
status: exit status: 1
command: "/usr/lib/llvm-18/bin/FileCheck" "--input-file" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/emcripten-catch-unwind/emcripten-catch-unwind.ll" "/checkout/tests/codegen/emcripten-catch-unwind.rs" "--check-prefix=CHECK" "--check-prefix" "NONMSVC" "--allow-unused-prefixes" "--dump-input-context" "100"
--- stderr -------------------------------
/checkout/tests/codegen/emcripten-catch-unwind.rs:49:12: error: CHECK: expected string not found in input
/checkout/tests/codegen/emcripten-catch-unwind.rs:49:12: error: CHECK: expected string not found in input
 // CHECK: [[ALLOCA:%.*]] = alloca
           ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/emcripten-catch-unwind/emcripten-catch-unwind.ll:14:7: note: scanning from here
      ^
      ^
/checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/emcripten-catch-unwind/emcripten-catch-unwind.ll:22:10: note: possible intended match here
 %catchpad2.i = catchpad within %catchswitch1.i [ptr null]

Input file: /checkout/obj/build/x86_64-unknown-linux-gnu/test/codegen/emcripten-catch-unwind/emcripten-catch-unwind.ll
Check file: /checkout/tests/codegen/emcripten-catch-unwind.rs


-dump-input=help explains the following input dump.
Input was:
<<<<<<
<<<<<<
            1: ; ModuleID = 'emcripten_catch_unwind.df8ae0c697b5edf5-cgu.0' 
            2: source_filename = "emcripten_catch_unwind.df8ae0c697b5edf5-cgu.0" 
            3: target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-f128:64-n32:64-S128-ni:1:10:20" 
            4: target triple = "wasm32-unknown-emscripten" 
            5:  
            6: ; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable 
            7: define noundef i32 @ptr_size() unnamed_addr #0 { 
            9:  ret i32 4 
           10: } 
           11:  
           12: ; Function Attrs: uwtable 
           12: ; Function Attrs: uwtable 
           13: define noundef i32 @test_catch_unwind(ptr nocapture noundef nonnull readonly %try_fn, ptr noundef %data, ptr nocapture noundef nonnull readonly %catch_fn) unnamed_addr #1 personality ptr @__gxx_wasm_personality_v0 { 
           14: start: 
check:49'0           X error: no match found
           15:  invoke void %try_fn(ptr %data) 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           16:  to label %__rust_try.exit unwind label %catchswitch.i 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           18: catchswitch.i: ; preds = %start 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           19:  %catchswitch1.i = catchswitch within none [label %catchpad.i] unwind to caller 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           21: catchpad.i: ; preds = %catchswitch.i 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           22:  %catchpad2.i = catchpad within %catchswitch1.i [ptr null] 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'1              ?                                                  possible intended match
           23:  %0 = tail call ptr @llvm.wasm.get.exception(token %catchpad2.i) 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           24:  %1 = tail call i32 @llvm.wasm.get.ehselector(token %catchpad2.i) 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           25:  call void %catch_fn(ptr %data, ptr %0) [ "funclet"(token %catchpad2.i) ] 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           26:  catchret from %catchpad2.i to label %__rust_try.exit 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           28: __rust_try.exit: ; preds = %start, %catchpad.i 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           29:  %common.ret.op.i = phi i32 [ 0, %start ], [ 1, %catchpad.i ] 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           30:  ret i32 %common.ret.op.i 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~
           31: } 
check:49'0     ~~
check:49'0     ~
check:49'0     ~
           33: declare i32 @__gxx_wasm_personality_v0(...) unnamed_addr #2 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           35: ; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           36: declare ptr @llvm.wasm.get.exception(token) #3 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           38: ; Function Attrs: mustprogress nocallback nofree nosync nounwind willreturn 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           39: declare i32 @llvm.wasm.get.ehselector(token) #3 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           41: attributes #0 = { mustprogress nofree norecurse nosync nounwind willreturn memory(none) uwtable "target-cpu"="generic" "target-features"="+exception-handling" } 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           42: attributes #1 = { uwtable "target-cpu"="generic" "target-features"="+exception-handling" } 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           43: attributes #2 = { "target-cpu"="generic" } 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           44: attributes #3 = { mustprogress nocallback nofree nosync nounwind willreturn } 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           46: !llvm.module.flags = !{!0} 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           47: !llvm.ident = !{!1} 
check:49'0     ~~~~~~~~~~~~~~~~~~~~
check:49'0     ~
check:49'0     ~
           49: !0 = !{i32 8, !"PIC Level", i32 2} 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           50: !1 = !{!"rustc version 1.84.0-nightly (c92ba44cb 2024-10-17)"} 
check:49'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
------------------------------------------



@workingjubilee
Copy link
Member

workingjubilee commented Oct 18, 2024

This is a draft because we need some additional setting for the Emscripten target to select between the old exception handling and the new exception handling.

Hm. I feel a little out-of-context here. Could you explain

  1. what makes the "old" exception-handling old?
  2. what makes the "new" exception-handling new?
  3. what parts of an emscripten program must agree on which exception handling we're using?
    • in the minimal case: linkage of a Rust main.rs, std, and Emscripten?
    • in other configurations of linkage that Emscripten supports?
  4. why we must have a toggle for this?

Is "new" meaning the wasm exception-handling proposal with exnref and wasm-level try blocks?

@workingjubilee workingjubilee added the O-emscripten Target: 50% off wasm32-unknown-musl. the savings come out of stdio.h, but hey, you get SDL! label Oct 21, 2024
@hoodmane
Copy link
Contributor Author

Thanks for the questions.

The short answer is these are two different ABIs. Perhaps it would be most accurate to think of wasm32-emscripten-old-eh and wasm32-emscripten-new-eh as distinct targets. This is similar to -sWASM_BIGINT. Unfortunately, Emscripten has tons of different ABIs. But some of the ABIs are better than others in terms of code speed, size, and bugginess.

What makes the "old" / "new" exception-handling old / new?

Well the older system has existed for a lot longer and worked with the first version of WebAssembly deployed in browsers. The new exception handling uses native support and has only been universally available in engines since ~March 2022.

what parts of an emscripten program must agree on which exception handling we're using?

We need different IR, different object files, and different linker arguments. But it's transparent to Rust/C++/C code.

in the minimal case: linkage of a Rust main.rs, std, and Emscripten?

Yeah so we need a distinct build of the standard library because the object files are different. The compilation of both main.rs and std need to change, namely by changing the implementation of the try intrinsic. Perhaps only libpanicunwind is affected and other libs that don't use the try intrinsic are unaffected, but it's unclear. Also, the linker arguments change.

why we must have a toggle for this?

Currently, Pyodide is stuck using the old ABI because Rust only supports the old ABI. If Rust switched to only supporting the new ABI, it would make the transition difficult.

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 21, 2024

To make matters worse, the new exception handling support was changed in October of 2023 so there are now two new exception handling variants:

  • the old new exception handling, which is universally supported but also deprecated, and
  • the new new exception handling, which is currently only supported in the most recent release of Firefox.

I think there is no difference in the IR the frontend generates between the old new and new new variants, though the lowering pass has distinct code for the two cases, the object file format is different, and so the new variant will require a different setting code generation time. It's a fairly simple transform to adjust an old object file to support the new variant and they plan "to provide a converter tool from the old binary to the new format to facilitate the use of the new instructions without the need for recompilation."

So we may end up with three ABIs.

@sbc100
Copy link

sbc100 commented Oct 21, 2024

I think there is no difference in the IR the frontend generates between the old new and new new variants

That is true today but the plan is to make llvm support the new variant natively ASAP. However at the at point I imagine there will be a lowering pass added to binaryen to support the previous version, so you should still only need a single object file format / libc build.

@sbc100
Copy link

sbc100 commented Oct 21, 2024

Unfortunately, Emscripten has tons of different ABIs.

While its true emscripten supports a lot of different options almost all of them are link-time only settings and don't effect the object file ABI. There are only subset of settings that effect the object format (i.e. that are vaslid at compile time): https://github.com/emscripten-core/emscripten/blob/10cb9d46cdd17e7a96de68137c9649d9a630fbc7/tools/settings.py#L81-L112. Note that WASM_BIGINT is not one of them.

@hoodmane
Copy link
Contributor Author

Are MAIN_MODULE/SIDE_MODULE really compile time settings? I only pass them in LDFLAGS...

@sbc100
Copy link

sbc100 commented Oct 21, 2024

Are MAIN_MODULE/SIDE_MODULE really compile time settings? I only pass them in LDFLAGS...

Only in the sense that they enable -fPIC.. you can use -fPIC itself instead if you prefer.

@workingjubilee
Copy link
Member

In general, "more_compiler_abi_flags == more_compatibility" is actually "more_compiler_flags == incompatibility", because you now have to have even more flags set to the correct values to compile code correctly.

So I would prefer that we not expose, nor even pretend that we wish to offer compatibility with, "tons of different ABIs". We already have quite enough problems with ABI-affecting flags. It does not really take many booleans to be able to count to a million, and a few million untested variants of Rust is not a thing we should be trying to support.

Having said that, supporting the new ABI behind an unstable flag that will then be ditched at some point has precedent, especially for wasm targets, see #117919

@workingjubilee
Copy link
Member

workingjubilee commented Oct 21, 2024

If a -Z flag for such a transition period is acceptable, @hoodmane, then like that flag did, opening an MCP, would be the next step: https://github.com/rust-lang/compiler-team/issues/new?template=major_change.md

While we Just Merged the soundness-fixing change of #131533 we usually do MCPs for Significant Changes To Preexisting Targets, especially ones that require this... shindig. A transition period and involvement of rustc's CLI.

They're usually sensible enough that it's a bit of a formality, so don't overthink it, the answers to my questions are more-or-less what you would paste in there, with a bit of additional stitching up to explain why e.g. Pyodide needs a flag for this transition and we don't want to do just a hard cut over.

@workingjubilee workingjubilee added the A-ABI Area: Concerning the application binary interface (ABI) label Oct 21, 2024
@hoodmane
Copy link
Contributor Author

do MCPs

Sounds good. Thanks again @workingjubilee for all of your help. I really appreciate it.

It does not really take many booleans to be able to count to a million, and a few million untested variants of Rust is not a thing we should be trying to support.

Yeah this is why I'm raising this as a concern. It seems clearly out of scope to support all of them. It's important to have a solution to this in mind ahead of time. I would prefer that Rust only supports the 2 configurations I care about and has real test coverage for them, but different projects will standardize on different options. It's a thorny issue coordination problem. It would be nice if Emscripten could formalize specific triples...

@hoodmane
Copy link
Contributor Author

hoodmane commented Oct 22, 2024

The fact that the ABIs can be linearly sorted by runtime support date is helpful. Instead of 20 different booleans leading to 2^20 different options toggling each independently, there are 21 options when we sort. Then if you only support:

  • the most recent date with a -Z flag
  • some option that you think is "old enough to be universally available"

maybe it's good enough. But having the in-between steps available as -Z flags would be helpful for debugging the target.

@hoodmane
Copy link
Contributor Author

Okay I opened a MCP:
rust-lang/compiler-team#801

@compiler-errors compiler-errors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 4, 2024
@RalfJung
Copy link
Member

RalfJung commented Dec 3, 2024

what parts of an emscripten program must agree on which exception handling we're using?

We need different IR, different object files, and different linker arguments. But it's transparent to Rust/C++/C code.

This does not answer the question: what happens when I link together crates that were compiled with different values for this flag? In particular, what if std was built with one value and the user uses a different value?

If the answer is Undefined Behavior, that must at least be very clearly documented as part of the flag docs, and that is a blocker for stabilizing this flag.

@RalfJung
Copy link
Member

RalfJung commented Dec 3, 2024

@hoodmane I also have a question regarding the plan outlined in the MCP (I'd prefer to ask this on Zulip but could not find your handle there):

  • Pyodide switches to using new ABI and starts passing the flag

Given that (IIUC) the flag you choose must match the flag used to build the standard library, how is this supposed to work?

@hoodmane
Copy link
Contributor Author

hoodmane commented Dec 3, 2024

I just signed up for Zulip. Thanks @RalfJung.

@workingjubilee
Copy link
Member

@hoodmane Remember to tag this with rustbot review when you update it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ABI Area: Concerning the application binary interface (ABI) O-emscripten Target: 50% off wasm32-unknown-musl. the savings come out of stdio.h, but hey, you get SDL! S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants