cmake: build libraries failed #68

Closed
trimpim opened this issue Nov 9, 2023 · 30 comments
Labels
fixed Issue is resolved

Comments

@trimpim
Contributor

trimpim commented Nov 9, 2023

When I try to run our wasmedge project with either 23.04 or 23.10, I get the following error:

[init -> wasmedge -> wasmedge] Error: LD: symbol not found: '__gcc_personality_v0'
[init -> wasmedge -> wasmedge] Error: Uncaught exception of type 'Linker::Not_found'
[init -> wasmedge -> wasmedge] Warning: abort called - thread: ep

This uses:

  • libunwind, libcxx-abi and libcxx from llvm
  • spdlog
  • wasmedge

These are all built using goa.

2ec360c resolves the issue for me, but as I'm no expert on linking this might be far from optimal.

This might be related to #66 and the errors I encountered in genodelabs/genode-world#342

@jschlatow
Member

I'm no expert on linking either. Yet, since __gcc_personality_v0 is part of ld.lib.so, shouldn't it suffice to add the whole-archive thing here:

lappend ldlibs_exe -Wl,--dynamic-linker=ld.lib.so

@ssumpf What's your take on this?

@chelmuth
Member

chelmuth commented Nov 10, 2023

Each time the silver bullet whole archive is mentioned it feels worse to swallow that pill because I did not get the rationale yet.

trimpim added a commit to trimpim/goa that referenced this issue Nov 10, 2023
@trimpim
Contributor Author

trimpim commented Nov 10, 2023

@jschlatow f256f01 also fixes the problem for me.

This hopefully will not break #60. The libraries I get are < 2MB.

@ssumpf
Member

ssumpf commented Nov 10, 2023

> Each time the silver bullet whole archive is mentioned it feels worse to swallow that pill because I did not get the rationale yet.

The rationale is that every symbol that is global but .hidden in a library becomes .local after linking and is thus inaccessible to the dynamic linker. When one, for example, links libgcc.a without whole archive against a shared library, the static linker will find libgcc symbols used by the shared library and create jump slot relocations for these symbols. Hence, when the library actually calls these symbols, the dynamic linker will not find them.
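
The visibility part of this argument can be illustrated with a rough toy model. This is only a sketch, not how the ELF loader or ld.lib.so is implemented: the dictionary layout, the function names, and the library `lib_a` are hypothetical, and `__udivdi3` merely stands in for any libgcc helper that is assumed to carry hidden visibility.

```python
# Toy model (not the real ELF loader): each library maps symbol -> visibility.

def dynamic_exports(library):
    """Symbols a shared object offers to the dynamic linker."""
    return {name for name, visibility in library.items()
            if visibility == "default"}


def unresolved_at_runtime(needed, loaded_libraries):
    """Symbols the dynamic linker cannot bind for a given set of needs."""
    exported = set()
    for library in loaded_libraries:
        exported |= dynamic_exports(library)
    return needed - exported


# Hypothetical shared object that linked a libgcc helper statically.
# The helper keeps its hidden visibility, so it is usable inside the
# object but does not show up among its dynamic exports.
lib_a = {"a_public_api": "default", "__udivdi3": "hidden"}

# A second module that still carries an undefined reference to the helper.
print(unresolved_at_runtime({"a_public_api", "__udivdi3"}, [lib_a]))
# prints {'__udivdi3'}
```

In this model, a hidden symbol is only useful to the object it was linked into; any other module that refers to it has to bring its own copy.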

@ssumpf
Member

ssumpf commented Nov 10, 2023

@trimpim, @jschlatow: I will have a look into this. @trimpim: Can I reproduce your scenario using https://github.com/trimpim/wasmedge-genode ? There doesn't seem to be a pkg.

@trimpim
Contributor Author

trimpim commented Nov 10, 2023

@ssumpf I have just pushed the branch test-231110. This contains the pkg and a small README.md. With this you should be able to reproduce it.

To simplify your life, I suggest you also take the fix from #67 if you are using make 4.4.

@ssumpf
Member

ssumpf commented Nov 10, 2023

> @ssumpf I have just pushed the branch test-231110. This contains the pkg and a small README.md. With this you should be able to reproduce it.
>
> To simplify your life, I suggest you also take the fix from #67 if you are using make 4.4.

@trimpim: Thanks, where do I find the llvm API that's set in used_apis?

@ssumpf
Member

ssumpf commented Nov 10, 2023

@trimpim: Never mind, found it.

@ssumpf
Member

ssumpf commented Nov 10, 2023

@trimpim: You need to change the 08-use_gcc_eh.patch of wasmedge from

   target_link_libraries(wasmedge_shared
     PRIVATE
     wasmedgeCAPI
+    # https://forums.developer.nvidia.com/t/undefined-reference-to-gcc-personality-v0/131127/3
+    gcc_eh
   )

to

   target_link_libraries(wasmedge_shared
     PRIVATE
     wasmedgeCAPI
+    # https://forums.developer.nvidia.com/t/undefined-reference-to-gcc-personality-v0/131127/3
+    -Wl,--whole-archive -Wl,-lgcc_eh -Wl,--no-whole-archive
   )

because __gcc_personality_v0 is part of libgcc_eh.a and not our dynamic linker which provides __gxx_personality_v0 only. You need to do the whole archive thing because the symbol is global .hidden, as described for libgcc above.

In the meantime, I will try to clean up the libgcc and whole-archive chaos a little and make sure that things do not break for you.

ssumpf added a commit to ssumpf/goa that referenced this issue Nov 10, 2023
* add libgcc as whole-archive to 'ldlibs_so' in flags.tcl
* remove unnecessary 'whole-archive' entries from qmake.tcl and cmake.tcl
* remove unnecessary and unused globals of 'ldlibs_so'

issue genodelabs#68
@ssumpf
Member

ssumpf commented Nov 10, 2023

@jschlatow: The issue for wasmedge can be fixed by adjusting a patch in the project. Otherwise, I tried to clean up the libgcc.a and whole-archive problem with commit feea13c.

@chelmuth
Member

> because __gcc_personality_v0 is part of libgcc_eh.a and not our dynamic linker which provides __gxx_personality_v0 only. You need to do the whole archive thing because the symbol is global .hidden, as described for libgcc above.

As we use libgcc_eh (and libsupc++) in a creative way in cxx.mk it is literally part of the linker but suffers from the global-hidden condition too (if I understand correctly). Would it make sense to review symbols/ld and the linking of ld.lib.so to provide more symbols of the runtime that are currently missing to solve the issues we address here?

@trimpim
Contributor Author

trimpim commented Nov 13, 2023

@ssumpf thanks for the fix.

With this change and your patch, building and running wasmedge works for me.

jschlatow pushed a commit that referenced this issue Nov 16, 2023
* add libgcc as whole-archive to 'ldlibs_so' in flags.tcl
* remove unnecessary 'whole-archive' entries from qmake.tcl and cmake.tcl
* remove unnecessary and unused globals of 'ldlibs_so'

issue #68
@jschlatow jschlatow added the fixed Issue is resolved label Nov 16, 2023
ssumpf added a commit to ssumpf/goa that referenced this issue Nov 20, 2023
* move the '-lgcc' from ldlibs_so to ldlibs_common and use ldlibs_common
  also for qmake (as done by the other build systems)
* remove unnecessary libgcc locating code for qmake

issue genodelabs#68
@ssumpf
Member

ssumpf commented Nov 20, 2023

@jschlatow: ce9b91b tries to improve upon f219ab3 by removing the unnecessary detection of libgcc from the qmake support and moving libgcc to ldlibs_common for all builds.

@jschlatow
Member

@ssumpf Unfortunately, ce9b91b breaks examples/hello_rust.

@nfeske
Member

nfeske commented Nov 22, 2023

Autoconf apparently also suffers. (experienced while attempting to port gforth on Linux/ARM64 on my MNT-Reform)

With the common, the basic compile test fails because all symbols of libgcc end up in the binary twice. This is probably because ldlibs_common is passed to configure as both LDLIBS and LIBS. When just specifying -lgcc, this is no problem because one lib can appear any number of times using the -l argument without causing multiple symbol definitions. But the whole-archive option seems to force the linker to squeeze all symbols of the lib into the binary. If specified twice, the symbols are added twice, ending up in a "double defined symbols" error.

I sense that the wrapping of -lgcc in a whole-archive block is not what we generally want.
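
The difference between repeating a plain -lgcc and repeating it inside a whole-archive block can be sketched with a small toy model. This is a simplification under stated assumptions, not GNU ld's actual algorithm: the tuple layout and the function `link` are hypothetical, the symbol names only stand in for libgcc helpers, and each archive is scanned in a single pass.

```python
def link(command_line):
    """Toy static link, returns (defined, duplicates).

    Each item is ("obj", defines, needs) or ("lib", members, whole_archive),
    where members is a list of (defines, needs) pairs of symbol sets.
    """
    defined, undefined, duplicates = set(), set(), set()

    def add(defines, needs):
        duplicates.update(defines & defined)  # symbol defined a second time
        defined.update(defines)
        undefined.update(needs)
        undefined.difference_update(defined)

    for kind, *rest in command_line:
        if kind == "obj":
            add(*rest)
            continue
        members, whole_archive = rest
        for defines, needs in members:
            # A plain archive contributes a member only if it resolves a
            # currently undefined symbol; whole-archive takes every member.
            if whole_archive or defines & undefined:
                add(defines, needs)
    return defined, duplicates


main = ("obj", {"main"}, {"__udivdi3"})
libgcc = [({"__udivdi3"}, set()), ({"__muldi3"}, set())]

# -lgcc -lgcc: the second occurrence finds nothing left to resolve.
print(link([main, ("lib", libgcc, False), ("lib", libgcc, False)])[1])

# Two whole-archive blocks: every member is added twice.
print(sorted(link([main, ("lib", libgcc, True), ("lib", libgcc, True)])[1]))
```

In this model the first call reports no duplicates, while the second reports every libgcc symbol as doubly defined, which matches the failure described for the configure test.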

@nfeske
Member

nfeske commented Nov 22, 2023

@ssumpf I also noticed that you removed the -nostdlib option. This is not good because without this option, a bunch of compiler heuristics kick in, which we don't want.

@ssumpf
Member

ssumpf commented Nov 22, 2023

> @ssumpf I also noticed that you removed the -nostdlib option. This is not good because without this option, a bunch of compiler heuristics kick in, which we don't want.

As far as I understand it, this is covered by -nostartfiles -nodefaultlibs -static-libgcc in ldlibs_common. We could change that to -nostdlib and make it the same for everyone.

@ssumpf
Member

ssumpf commented Nov 22, 2023

> Autoconf apparently also suffers. (experienced while attempting to port gforth on Linux/ARM64 on my MNT-Reform)
>
> With the common, the basic compile test fails because all symbols of libgcc end up in the binary twice. This is probably because ldlibs_common is passed to configure as both LDLIBS and LIBS. When just specifying -lgcc, this is no problem because one lib can appear any number of times using the -l argument without causing multiple symbol definitions. But the whole-archive option seems to force the linker to squeeze all symbols of the lib into the binary. If specified twice, the symbols are added twice, ending up in a "double defined symbols" error.
>
> I sense that the wrapping of -lgcc in a whole-archive block is not what we generally want.

Okay, this one is new to me. In this case we want -lgcc for all binaries and the whole-archive for shared libraries only.

@nfeske
Member

nfeske commented Nov 23, 2023

> As far as I understand it, this is covered by -nostartfiles -nodefaultlibs -static-libgcc

That's true - at least that was the rationale of 9dcadf7. It seems that I missed adapting qmake.tcl in this respect. So it's good to remove this option. Could you do this in a separate commit?

ssumpf added a commit to ssumpf/goa that referenced this issue Nov 23, 2023
* add the '-lgcc' to ldlibs_common and use ldlibs_common
  also for qmake (as done by the other build systems)
* remove libgcc locating code for qmake

issue genodelabs#68
ssumpf added a commit to ssumpf/goa that referenced this issue Nov 23, 2023
This is already covered by 'ldlibs_common'

issue genodelabs#68
@ssumpf
Member

ssumpf commented Nov 23, 2023

I have added 39755ba to remove -nostdlib from qmake.tcl and made adjustments to use ldlibs_common for Qt5 apps as well (9f94761). With this, all the tests (including Rust), the Linphone-SDK (with a minor build-system check tweak for arm_v8a), my Qt5 scenarios, and wasmedge are working for me.

@ssumpf
Member

ssumpf commented Nov 23, 2023

P.S. This also resolves the hello_make static constructor problem.

@chelmuth
Member

As we are again orbiting around whole-archive I took yesterday afternoon to get an idea of the actual situation and how several statements of the past fit into this picture.

  1. GCC's -static-libgcc is ineffective in our configuration as -nodefaultlibs disables the desired libgcc magic. From the manpage: Only the libraries you specify are passed to the linker, and options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, are ignored.
  2. The size of libgcc is significant. In examples/cmake_library, libforty_two.lib.so increases from 14928 to 347856 bytes with --whole-archive. This overhead applies to all binaries incl. shared libs.
  3. I could not find any local or hidden wizardry in the shared object link. From my investigation, it's just the ancient plain rule of linker command lines that applies here: missing symbols are resolved from the remainder of the arguments to the right (or inside --start-group .. --end-group). So, if -lgcc is forced to the end of the linker command line, it just works.

After the investigation, I patched examples for the test ecf97e0 and sketched solutions for common flags e3f8557 as well as cmake 459308b. My question is now: Can we walk this road and, thus, wipe some myths and legends associated with this topic?
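
The ordering rule from point 3 can be sketched as a toy model of a left-to-right link. It is a simplified illustration, not GNU ld's implementation: the classes `Obj` and `Archive` and the function `link` are hypothetical names, and the symbol is borrowed from the arm_v8a report in this thread purely as an example.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    defines: set
    needs: set = field(default_factory=set)


@dataclass
class Archive:
    members: list  # list of Obj


def link(command_line):
    """Process inputs left to right, return symbols left undefined."""
    defined, undefined = set(), set()

    def add(obj):
        defined.update(obj.defines)
        undefined.update(obj.needs)
        undefined.difference_update(defined)

    for item in command_line:
        if isinstance(item, Obj):
            add(item)
            continue
        # An archive is only consulted at its position on the command line.
        # Members are pulled in while they resolve a pending undefined symbol.
        remaining = list(item.members)
        pulled = True
        while pulled:
            pulled = False
            for member in list(remaining):
                if member.defines & undefined:
                    add(member)
                    remaining.remove(member)
                    pulled = True
    return undefined


app = Obj(defines={"main"}, needs={"__aarch64_ldadd4_acq_rel"})
libgcc = Archive([Obj(defines={"__aarch64_ldadd4_acq_rel"})])

print(link([libgcc, app]))  # archive first: nothing is needed yet
print(link([app, libgcc]))  # archive last: the reference gets resolved
```

With the archive listed first, the reference stays undefined because nothing had asked for it when the archive was scanned. With the archive at the end, the same inputs link cleanly, without any whole-archive block.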

@chelmuth
Member

> P.S. This also resolves the hello_make static constructor problem.

Could you please tell us the nature of the problem? Which constructor was not called?

Also, 9f94761 changes flags.tcl en passant, but the commit message suggests changes (and effects) to qmake only, while all build systems are affected.

@jschlatow jschlatow removed the fixed Issue is resolved label Nov 24, 2023
@ssumpf
Member

ssumpf commented Nov 24, 2023

> As we are again orbiting around whole-archive I took yesterday afternoon to get an idea of the actual situation and how several statements of the past fit into this picture.
>
>   1. GCC's -static-libgcc is ineffective in our configuration as -nodefaultlibs disables the desired libgcc magic. From the manpage: Only the libraries you specify are passed to the linker, and options specifying linkage of the system libraries, such as -static-libgcc or -shared-libgcc, are ignored.
>   2. The size of libgcc is significant. In examples/cmake_library, libforty_two.lib.so increases from 14928 to 347856 bytes with --whole-archive. This overhead applies to all binaries incl. shared libs.
>   3. I could not find any local or hidden wizardry in the shared object link. From my investigation, it's just the ancient plain rule of linker command lines that applies here: missing symbols are resolved from the remainder of the arguments to the right (or inside --start-group .. --end-group). So, if -lgcc is forced to the end of the linker command line, it just works.
>
> After the investigation, I patched examples for the test ecf97e0 and sketched solutions for common flags e3f8557 as well as cmake 459308b. My question is now: Can we walk this road and, thus, wipe some myths and legends associated with this topic?

@chelmuth: I have tried your branch and it seems to work well in most cases. qt5_quicktest does not link for arm_v8a (undefined reference to __aarch64_ldadd4_acq_rel), linphone-simple produces the same undefined reference in the library for the libservicecontrolplugin.lib.so. This can quickly be reproduced using my goa-projects (https://github.com/ssumpf/goa-projects - master branch). Note this has always been a problem on arm_v8a only, I never saw it on x86.

@ssumpf
Member

ssumpf commented Nov 24, 2023

@chelmuth: I will try to get -lgcc to the end of the linker command line for qmake next.

@ssumpf
Member

ssumpf commented Nov 24, 2023

@chelmuth: Okay, -lgcc at the end of the linking command by hand works like a charm! Learned some ancient knowledge today ;) The only question that remains is how to convince the Qt5 build system to do so? @cproc: Do you have any suggestions?

@cproc
Member

cproc commented Nov 24, 2023

It looks like GENODE_QMAKE_LIBS needs to be set as well like in https://github.com/genodelabs/genode/blob/master/repos/libports/lib/import/import-qt5_qmake.mk.

@ssumpf
Member

ssumpf commented Nov 24, 2023

> It looks like GENODE_QMAKE_LIBS needs to be set as well like in https://github.com/genodelabs/genode/blob/master/repos/libports/lib/import/import-qt5_qmake.mk.

@cproc: Yes this does the trick 👍

ssumpf pushed a commit to ssumpf/goa that referenced this issue Nov 24, 2023
* remove '-static-libgcc' from ldlibs_common as '-nodefaultlibs' disables the
  desired libgcc
* remove '-lgcc' from whole archive of ldlibs_so because it is not
  required when '-lgcc' is *always* at the end of the linking command

issue genodelabs#68
ssumpf pushed a commit to ssumpf/goa that referenced this issue Nov 24, 2023
ssumpf added a commit to ssumpf/goa that referenced this issue Nov 24, 2023
We use the GENODE_QMAKE_LIBS variable to achieve this.

issue genodelabs#68
ssumpf added a commit to ssumpf/goa that referenced this issue Nov 24, 2023
This is covered by ldlibs_common

issue genodelabs#68
@ssumpf
Member

ssumpf commented Nov 24, 2023

@jschlatow: The commits above (my staging branch) are hopefully the last ones regarding this issue. Thanks to our combined knowledge I am pretty happy with this solution and everything works as expected.

@ssumpf ssumpf added the fixed Issue is resolved label Nov 24, 2023
jschlatow pushed a commit that referenced this issue Nov 24, 2023
* remove '-static-libgcc' from ldlibs_common as '-nodefaultlibs' disables the
  desired libgcc
* remove '-lgcc' from whole archive of ldlibs_so because it is not
  required when '-lgcc' is *always* at the end of the linking command

issue #68
jschlatow pushed a commit that referenced this issue Nov 24, 2023
* remove unnecessary 'whole-archive' entries from qmake.tcl and cmake.tcl
* remove unnecessary and unused globals of 'ldlibs_so'

issue #68
jschlatow pushed a commit that referenced this issue Nov 24, 2023
We use the GENODE_QMAKE_LIBS variable to achieve this.

issue #68
jschlatow pushed a commit that referenced this issue Nov 24, 2023
This is covered by ldlibs_common

issue #68
@jschlatow
Member

Thanks for the collaborative effort. I'm also happy with the result.

I merged the commits and force-pushed to staging to eliminate commit 81ced02.
