Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NativeAOT] Linux/ARM bring-up (3/n) #97104

Merged
merged 19 commits into from
Jan 19, 2024
Merged

Conversation

filipnavara
Copy link
Member

@filipnavara filipnavara commented Jan 17, 2024

Some smoke tests pass:

Time [secs] | Total | Passed | Failed | Skipped | Assembly Execution Summary
============================================================================
      0.192 |     1 |      1 |      0 |       0 | nativeaot.CustomMain.XUnitWrapper.dll
      0.190 |     1 |      1 |      0 |       0 | nativeaot.GenerateUnmanagedEntryPoints.XUnitWrapper.dll
     24.809 |    15 |     12 |      3 |       0 | nativeaot.SmokeTests.XUnitWrapper.dll
      0.200 |     1 |      1 |      0 |       0 | nativeaot.StartupHook.XUnitWrapper.dll
----------------------------------------------------------------------------
     25.391 |    18 |     15 |      3 |       0 | (total)

To do:

  • Fix MUSL builds
  • Fix DWARF method lookup w.r.t. thumb bit
  • IMAGE_REL_BASED_THUMB_MOV32_PCREL relocation causing corrupted pointers after linking
  • GC / STW thread hijacking is unimplemented/broken
  • Exclude DwarfTest or ensure that llvm-dwarpdump is downloaded from NuGet to correct location
  • Investigate segmentation fault in UnitTests and DynamicGenerics smoke tests (uncatchable in QEMU debugger)

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jan 17, 2024
@ghost
Copy link

ghost commented Jan 17, 2024

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

Some smoke tests pass:

Time [secs] | Total | Passed | Failed | Skipped | Assembly Execution Summary
============================================================================
      0.180 |     1 |      1 |      0 |       0 | nativeaot.CustomMain.XUnitWrapper.dll
      0.183 |     1 |      1 |      0 |       0 | nativeaot.GenerateUnmanagedEntryPoints.XUnitWrapper.dll
      5.847 |    15 |     11 |      4 |       0 | nativeaot.SmokeTests.XUnitWrapper.dll
----------------------------------------------------------------------------
      6.210 |    17 |     13 |      4 |       0 | (total)

To do:

  • GC / STW thread hijacking is unimplemented/broken.
  • Dehydration sometimes results in corrupted method tables (shifted by 4 bytes). Curiously, it doesn't happen for -no-pie executables, or if the hydrated section is renamed to .bss.hydrated and merged by the linker into regular BSS. LLDB also handles the symbol name demangling in the regular BSS section but doesn't handle it for the custom section.
  • TBD
Author: filipnavara
Assignees: -
Labels:

community-contribution, area-NativeAOT-coreclr

Milestone: -

src/coreclr/CMakeLists.txt Outdated Show resolved Hide resolved
@filipnavara filipnavara force-pushed the naot-bringup-003 branch 2 times, most recently from 67eed4a to 57fb79c Compare January 17, 2024 18:59
@MichalStrehovsky
Copy link
Member

if the hydrated section is renamed to .bss.hydrated

We can place this into .bss section. I don't remember why I made it a separate section.

@filipnavara
Copy link
Member Author

We can place this into .bss section. I don't remember why I made it a separate section.

Generally speaking I think it's a good idea, at least for the improved LLDB debugging experience. I'd still like to understand the errors I am seeing though.

@filipnavara
Copy link
Member Author

filipnavara commented Jan 18, 2024

I'd still like to understand the errors I am seeing though.

I was wrong about the offset in dehydrated data being off by 4. It's actually off by 0x10000 in the linked executable. It uses the MOVW/MOVT instructions to load the address which each loads a 16-bit word. I'm following through with the investigation, but I am not fully ruling out a linker bug (*).

(*) Already found one related to the thumb bit (lowest bit) in MOVW/MOVT. For that one it's quite obvious how it happens and why no one hit it before - other compilers don't produce inline addend with Thumb bit. This one seems like a tougher nut to crack.

@filipnavara
Copy link
Member Author

I traced back another issue to the off-by-0x10000 relocation into the BSS sections. This time with frozen objects. :-/

@filipnavara
Copy link
Member Author

I'm now >90% sure that there's a bug in LLD with handling of the MOVW/MOVT relocations. I built the mold linker with eh_frame patch and it doesn't produce the corrupted binaries. For example, the Reflection_FromUsage smoke test successfully passed.

@filipnavara
Copy link
Member Author

filipnavara commented Jan 18, 2024

I found the issue that causes the data corruption. Relocation.PutThumb2Mov32 takes a 32-bit addend and writes the upper half of the addend to the MOVT relocation. That's not how the ARM ELF relocations work though. They expect the addend to fit within 16-bits and be the same for both the MOVW and MOVT relocation. If the relative offset is something like 0x1c000a and the addend is -12 then the MOVW part has the addend applied correctly but the MOVT part is not (inline addend in the instruction code is 0 instead of -12 and it ends up off-by-one).

@filipnavara
Copy link
Member Author

I don't remember why I made it a separate section.

I presume that the reason is to keep match the layout of the hydrated and dehydrated data in the node without interleaving of other things put into the BSS section. It makes sense to do that while producing the object file. We can tweak the final linked output to merge the sections but it's not really a big difference in the end. The concern I previously had about LLDB debugging experience was caused by the MOVT relocation issue, not the different section name.

@filipnavara filipnavara marked this pull request as ready for review January 19, 2024 10:04
@filipnavara filipnavara requested a review from jkotas January 19, 2024 10:04
eng/Subsets.props Outdated Show resolved Hide resolved
Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@jkotas jkotas merged commit 5a6135e into dotnet:main Jan 19, 2024
178 of 181 checks passed
@filipnavara filipnavara deleted the naot-bringup-003 branch January 19, 2024 17:22
tmds pushed a commit to tmds/runtime that referenced this pull request Jan 23, 2024
* Mask thumb bit when setting IP in a context (eg. palContext->SetIp((uintptr_t)&RhpThrowHwEx))

* Set ExInfo.m_pExContext in RhpThrowHwEx

* Remove REGDISPLAY.GetAddrOfIP and it's only usage

* Resolve ARM relocations (also workarounds LLD bug with thumb bit present both in addend and in symbol value)

* Remove REGDISPLAY.pIP/SetAddrOfIP

* Make most of the assembly code PIE compatible

* NativeAOT: Enable DFEATURE_64BIT_ALIGNMENT on linux-arm

* Enable NativeAOT linux-arm build

* Enable DWARF exception handling for linux-arm

* Fix UnwindFuncletInvokeThunk to skip over r2 register saved on stack by RhpCallFilterFunclet

* Fix signature of P/Invoke native code in SafeHandleTest to match the managed one

* Implement missing StackFrameIterator::InternalInit code for ARM

* Fix thumb bit masking in InInterfaceDispatchHelper

* Workaround: Ensure the Thumb bit is set when looking up method info in DWARF. We would fail to lookup methods at their first instruction otherwise.

* Correctly convert the addend for IMAGE_REL_BASED_THUMB_MOV32[_PCREL] into the ELF relocations

* Simplify CMake condition

* Simplify NativeAotSupported conditions

* Fix typo

* Fix IMAGE_REL_BASED_THUMB_MOV32 conversion to ELF
@github-actions github-actions bot locked and limited conversation to collaborators Mar 1, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-arm32 area-NativeAOT-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants