[library_dylink.js] How to link a Rust side module with C++ main module: missing invoke_ functions in proxyHandler #22906

Open
carlopi opened this issue Nov 11, 2024 · 10 comments

Comments

@carlopi

carlopi commented Nov 11, 2024

Please include the following in your bug report:

Version of emscripten/emsdk:
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.68 (ceee49d)
clang version 20.0.0git (https:/github.com/llvm/llvm-project 5cc64bf60bc04b9315de3c679eb753de4d554a8a)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/carlo/emsdk/upstream/bin

The situation:
I work on duckdb-wasm, a C++ codebase compiled to Wasm via Emscripten.
To extend the functionality of the project, we allow code to be added at run time via extensions, in both native and Wasm builds.

In the Wasm case, the main project ships as a JS and a Wasm module, and each extension is itself a Wasm module. Emscripten's implementation of dlopen remaps imports and exports dynamically, and execution then continues with the additional functionality.

This works end to end for C++ extensions: on dlopen, imports are correctly remapped in the JS layer and everything works as expected.

A basic example looks like this:

SELECT st_area('POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))'::geometry);
Catalog Error: Scalar Function with name "st_area" is not in the catalog, but it exists in the spatial extension.

LOAD spatial;
SELECT st_area('POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))'::geometry);
┌─────────────────────────────────────────────────────────────────┐
│ st_area(CAST('POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))' AS geometry)) │
╞═════════════════════════════════════════════════════════════════╡
│                                                             1.0 │
└─────────────────────────────────────────────────────────────────┘

(live demo at https://shell.duckdb.org/#queries=v0,SELECT-st_area('POLYGON((0-0%2C-0-1%2C-1-1%2C-1-0%2C-0-0))'%3A%3Ageometry)~,LOAD-spatial~,SELECT-st_area('POLYGON((0-0%2C-0-1%2C-1-1%2C-1-0%2C-0-0))'%3A%3Ageometry)~)

Now I am looking to do the same, but using Rust-based code, compiled with cargo with target wasm32-unknown-emscripten.

The problem
The Rust code compiles, but on dlopen some missing-symbol errors are raised by code that appears to come from https://github.com/emscripten-core/emscripten/blob/main/src/library_dylink.js#L699, which looks like this:

                    var proxyHandler = {
                        get(stubs, prop) {
                            switch (prop) {
                                case "__memory_base":
                                    return memoryBase;
                                case "__table_base":
                                    return tableBase
                            }
                            if (prop in wasmImports && !wasmImports[prop].stub) {
                                return wasmImports[prop]
                            }
                            if (!(prop in stubs)) {
                                var resolved;
                                stubs[prop] = (...args) => {
                                    resolved ||= resolveSymbol(prop);
                                    return resolved(...args)
                                }
                            }
                            return stubs[prop]
                        }
                    };

The error is that invoke_viii, invoke_vii, or similarly named functions are not present at the JavaScript level.

The hack
Adding a conditional like:

                    var proxyHandler = {
                        get(stubs, prop) {
+                          if (prop.startsWith("invoke_")) {
+                              return createDyncallWrapper(prop.substring(7));
+                          }
                            switch (prop) {
                                case "__memory_base":

solves the problem. It also shows that the invoke_ functions are special in that their code can be reconstructed from the signature alone to perform the correct indirect call.
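To make the point about reconstructing an invoke_ function from its signature concrete, here is a minimal sketch of what such a wrapper does: perform the indirect call through the function table and, if the callee throws, restore the stack and record that a throw happened. The names `table`, `threw`, `stackPointer` and `makeInvoke` are hypothetical stand-ins for this illustration, not Emscripten's actual runtime objects, and the real implementation differs in details (for example, in how it decides whether to rethrow).

```javascript
// Hypothetical stand-ins for runtime pieces; NOT Emscripten's real objects.
const table = [];        // plays the role of the Wasm function table
let threw = 0;           // plays the role of the flag set via setThrew
let stackPointer = 0;    // plays the role of the Wasm stack pointer

// Build an invoke_-style wrapper from a signature string such as "vii".
function makeInvoke(sig) {
  return (ptr, ...args) => {
    const savedSp = stackPointer;        // analogous to stackSave()
    try {
      return table[ptr](...args);        // the actual indirect call
    } catch (e) {
      stackPointer = savedSp;            // analogous to stackRestore()
      threw = 1;                         // tell the caller that a throw happened
      // Placeholder result: nothing for a void signature, 0 otherwise.
      return sig[0] === 'v' ? undefined : 0;
    }
  };
}

// Usage: slot 1 returns normally, slot 2 throws.
table[1] = (a, b) => a + b;
table[2] = () => { throw new Error('panic'); };
console.log(makeInvoke('iii')(1, 2, 3)); // 5
makeInvoke('v')(2);
console.log(threw);                      // 1
```

Only the first character of the signature is used in this sketch, to pick the placeholder return value. The argument list is forwarded as-is, which is why the wrapper can be generated from the name alone.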

This workaround is very brittle. I am looking for a proper solution, or for directions or material on how this can be properly supported.

@kripken
Member

kripken commented Nov 11, 2024

@hoodmane @sbc100 would know the proper answer, but a workaround is to use Wasm Exceptions, -fwasm-exceptions. That would remove the need for the invoke functions entirely.

@hoodmane
Collaborator

Rust doesn't support -fwasm-exceptions at all unfortunately. Cf rust-lang/rust#131830

@hoodmane
Collaborator

@carlopi I've encountered this problem before and your workaround looks pretty similar to what I did. Compiling with -Cpanic=abort would also fix it, if you don't need to catch panics.

@sbc100
Collaborator

sbc100 commented Nov 11, 2024

Funnily enough I just submitted a change to remove createDyncallWrapper completely: #22825.

All the invoke_xx functions needed by a given module should be contained within the module itself (this is true for both the main module and the side module).

There was an issue where these functions were not being correctly added to the global list when libraries were loaded with RTLD_LOCAL. However, that was fixed in #22625, which was released as part of 3.1.68. So I would hope that upgrading to 3.1.68 or above fixes this issue.

@sbc100
Collaborator

sbc100 commented Nov 11, 2024

Or wait, I see you are on 3.1.68 already.

Can you confirm which invoke_xx symbols are missing and which dynCall_xx exports are present in the side module you are trying to load?

@sbc100
Collaborator

sbc100 commented Nov 11, 2024

Imports that start with invoke_ should be implemented via createInvokeFunction, which is called from resolveGlobalSymbol:

    else if (symName.startsWith('invoke_')) {
      // Create (and cache) new invoke_ functions on demand.
      sym = wasmImports[symName] = createInvokeFunction(symName.split('_')[1]);
    }

Can you see why this code is not executing in your case?
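The on-demand creation and caching described above can be sketched in isolation as follows. `knownImports`, `makeInvokeFor` and `resolveSymbolSketch` are hypothetical names used only for this illustration. The real resolveGlobalSymbol handles more cases than this, so treat it as a sketch of the caching idea rather than the actual implementation.

```javascript
// Hypothetical import table and wrapper factory, for illustration only.
const knownImports = { puts: (s) => s.length };
let created = 0;
const makeInvokeFor = (sig) => {
  created++;                                  // count how many wrappers get built
  return (ptr, ...args) => `${sig}:${ptr}`;   // dummy wrapper body
};

function resolveSymbolSketch(name) {
  let sym = knownImports[name];
  if (!sym && name.startsWith('invoke_')) {
    // 'invoke_ji'.split('_')[1] === 'ji', i.e. the signature part of the name.
    sym = knownImports[name] = makeInvokeFor(name.split('_')[1]);
  }
  return sym;                                 // undefined if nothing matched
}

const first = resolveSymbolSketch('invoke_ji');
const second = resolveSymbolSketch('invoke_ji');
console.log(first === second, created);       // true 1
console.log(resolveSymbolSketch('missing'));  // undefined
```

If a lookup for an invoke_ name never reaches a branch like this one, no wrapper is created and the caller is left with an unresolved symbol, which matches the symptom reported in this issue.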

@carlopi
Author

carlopi commented Nov 12, 2024

> @carlopi I've encountered this problem before and your workaround looks pretty similar to what I did. Compiling with -Cpanic=abort would also fix it, if you don't need to catch panics.

I do need to be able to catch panics, but I am not sure this is connected to exceptional behaviour. To me it looks like it is about the regular handling of indirect calls.

@sbc100:
Taking for example the module at https://community-extensions.duckdb.org/v1.1.3/wasm_eh/lindel.duckdb_extension.wasm, it has 11 invoke_ and 3 dynCall_:

% grep "import.*invoke_" lindel.wat                                                       
  (import "env" "invoke_vi" (func (;56;) (type 3)))
  (import "env" "invoke_vii" (func (;59;) (type 5)))
  (import "env" "invoke_viii" (func (;61;) (type 8)))
  (import "env" "invoke_iiii" (func (;62;) (type 19)))
  (import "env" "invoke_viiiii" (func (;63;) (type 13)))
  (import "env" "invoke_ii" (func (;64;) (type 1)))
  (import "env" "invoke_v" (func (;73;) (type 2)))
  (import "env" "invoke_iiiiii" (func (;83;) (type 16)))
  (import "env" "invoke_vij" (func (;122;) (type 8)))
  (import "env" "invoke_vijj" (func (;123;) (type 13)))
  (import "env" "invoke_ji" (func (;124;) (type 1)))
% grep "export.*dynCa" lindel.wat
  (export "dynCall_vij" (func 475))
  (export "dynCall_vijj" (func 476))
  (export "dynCall_ji" (func 477))
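The same listing can be obtained without converting to .wat, using the standard WebAssembly.Module.imports and WebAssembly.Module.exports reflection APIs. The sketch below is a hypothetical helper (`listInvokeAndDynCall` is not part of Emscripten) and runs against a tiny hand-assembled module so that it is self-contained. For a real side module one would pass the file's bytes instead, assuming the engine supports every Wasm feature the module uses.

```javascript
// List the invoke_* function imports and dynCall_* function exports of a module.
function listInvokeAndDynCall(bytes) {
  const mod = new WebAssembly.Module(bytes);
  const invokes = WebAssembly.Module.imports(mod)
    .filter((i) => i.kind === 'function' && i.name.startsWith('invoke_'))
    .map((i) => i.name);
  const dynCalls = WebAssembly.Module.exports(mod)
    .filter((e) => e.kind === 'function' && e.name.startsWith('dynCall_'))
    .map((e) => e.name);
  return { invokes, dynCalls };
}

// Helpers to hand-assemble a minimal demo module (short names and sections only).
const wasmName = (s) => [s.length, ...[...s].map((c) => c.charCodeAt(0))];
const section = (id, body) => [id, body.length, ...body];

const demo = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                      // magic + version
  ...section(1, [1, 0x60, 0, 0]),                                       // type 0: () -> ()
  ...section(2, [1, ...wasmName('env'), ...wasmName('invoke_v'), 0, 0]), // import env.invoke_v
  ...section(3, [1, 0]),                                                // one local function, type 0
  ...section(7, [1, ...wasmName('dynCall_v'), 0, 1]),                   // export function index 1
  ...section(10, [1, 2, 0, 0x0b]),                                      // its (empty) body
]);

console.log(listInvokeAndDynCall(demo));
```

In Node, something like `listInvokeAndDynCall(require('fs').readFileSync(path))` should work for a module on disk, where `path` is a placeholder for the location of the side module.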

And I get into proxyHandler on a use of invoke_ji, which can't find the symbol.

It is also strange that I see this problem after dlopen has completed, while executing regular code.
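One possible explanation for the timing, based only on the proxyHandler code quoted earlier in this issue: the handler hands out a stub for any unknown import and resolves the real symbol on the first call, so a missing symbol would surface at call time and not at load time. The sketch below reproduces that shape with a hypothetical resolver (`resolveOrThrow` and `known` are made up for illustration). It is not a diagnosis of this particular bug.

```javascript
// Hypothetical resolver that only knows one symbol.
const known = { add: (a, b) => a + b };
function resolveOrThrow(prop) {
  if (!(prop in known)) throw new Error(`missing symbol: ${prop}`);
  return known[prop];
}

// Same shape as the quoted handler: return a stub now, resolve on first call.
const lazyHandler = {
  get(stubs, prop) {
    if (!(prop in stubs)) {
      let resolved;
      stubs[prop] = (...args) => {
        resolved ||= resolveOrThrow(prop);
        return resolved(...args);
      };
    }
    return stubs[prop];
  },
};

const env = new Proxy({}, lazyHandler);
const add = env.add;             // "load time": succeeds
const invoke_ji = env.invoke_ji; // "load time": also succeeds, only a stub is returned
console.log(add(1, 2));          // 3
try {
  invoke_ji(0, 0);               // first call: resolution happens here and fails
} catch (e) {
  console.log(e.message);        // missing symbol: invoke_ji
}
```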

Thanks a lot for the help obviously.

I also have another concern: are modules built with different Emscripten versions expected to be compatible with each other?

My problem is that the main module and the side modules are built at different points in time, on different machines and setups, and it is not obvious that they use the same Emscripten version. Should I enforce that, or is there a way to tell, given two Emscripten versions, whether they are ABI compatible? Or is there no such guarantee? Is this on the roadmap?

@sbc100
Collaborator

sbc100 commented Nov 12, 2024

> My problem is that the main module and the side modules are built at different points in time, on different machines and setups, and it is not obvious that they use the same Emscripten version. Should I enforce that, or is there a way to tell, given two Emscripten versions, whether they are ABI compatible? Or is there no such guarantee? Is this on the roadmap?

Sadly we don't currently offer guarantees about ABI compatibility, so it's best if you can build the main module and side modules with the same version of emscripten. By the way, does the problem here go away if you do that? I.e. is this bug actually about ABI incompatibility between modules built with different versions of emscripten? Or is there something else here too?

In terms of ABI compatibility, we do hope that breakages are rare, and we do hope to one day make stronger guarantees about this.

@carlopi
Author

carlopi commented Nov 12, 2024

@sbc100: I encountered the problem in local development, so the same version of emscripten was used to build both the main Wasm module with its accompanying JavaScript and the Rust side module.

@sbc100
Collaborator

sbc100 commented Nov 12, 2024

> And I get into proxyHandler on a use of invoke_ji, which can't find the symbol.

Can you verify if resolveGlobalSymbol is being called for invoke_ji and if that in turn is calling createInvokeFunction? Is createInvokeFunction even part of that output?
