
GC-report all associated LoaderAllocators #110856

Draft

tomeksowi wants to merge 2 commits into main from gc-report-all-LoaderAllocators

Conversation

tomeksowi
Contributor

Fixes an intermittent (about 1 failure in 100 runs) assert failure (refCollectionObject != NULL) in Interop/ICustomMarshaler/ConflictingNames/MultipleALCs with GCStress=0x3 on RISC-V. The symptom is the same as in #47696 (abandoned because the issue went away spontaneously and the root cause was never found).

I did some more digging, and the failure always occurred while walking the following stack:

System.SpanHelpers.SequenceEqual    <----- assertion fails here
Filter.Match
MemberInfoCache`1[__Canon].PopulateMethods
MemberInfoCache`1[__Canon].GetListByName
MemberInfoCache`1[__Canon].Populate
System.RuntimeType.GetMethodCandidates
System.RuntimeType.GetMethodImplCommon
RunInALC.Run
RunInALC.ConflictingCustomMarshalerNamesInNoncollectibleLoadContexts_Succeeds
RunInALC.TestEntryPoint

The problem was that after ConflictingCustomMarshalerNamesInCollectibleLoadContexts_Succeeds(), the LoaderAllocator's weak ref to its object had been severed, but the LoaderAllocator was still associated with the loaded image's memory range. Then, in ConflictingCustomMarshalerNamesInNoncollectibleLoadContexts_Succeeds(), the same file was loaded again and another LoaderAllocator got associated with the reused image. The refs passed to SequenceEqual pointed to a method name whose bytes were in that image. During GC, in PromoteCarefully, LoaderAllocator::GetAssociatedLoaderAllocator_Unsafe returned the first associated LoaderAllocator, the one with the severed weak ref, which tripped the assertion.

I didn't find anything that prevents multiple LoaderAllocators from being associated with the same memory range, so I'm assuming it's a valid state. The fix is to report all associated LoaderAllocators and assert that at least one of them has a valid ref to its exposed object.
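
As an illustration of the intended behavior (a minimal, self-contained sketch with simplified stand-in types, not the actual runtime change in this PR), the difference between the old first-match lookup and reporting all associated allocators looks like this:

```cpp
// Sketch only: LoaderAllocatorStub is not the runtime's LoaderAllocator, and the
// range list is a plain vector rather than the VM's range table.
#include <cassert>
#include <cstdint>
#include <vector>

struct LoaderAllocatorStub
{
    uintptr_t rangeStart;          // start of the associated image range
    uintptr_t rangeEnd;            // end of the associated image range
    bool      exposedObjectAlive;  // stands in for the weak ref to the managed LoaderAllocator object
};

// Old behavior (sketch): return only the first allocator whose range contains the pointer.
const LoaderAllocatorStub* FindFirstAssociated(const std::vector<LoaderAllocatorStub>& allocators,
                                               uintptr_t interiorPtr)
{
    for (const auto& la : allocators)
        if (interiorPtr >= la.rangeStart && interiorPtr < la.rangeEnd)
            return &la;
    return nullptr;
}

// Fixed behavior (sketch): consider every allocator associated with the range and
// require only that at least one of them still has a live exposed object.
bool ReportAllAssociated(const std::vector<LoaderAllocatorStub>& allocators, uintptr_t interiorPtr)
{
    bool anyLive = false;
    for (const auto& la : allocators)
    {
        if (interiorPtr >= la.rangeStart && interiorPtr < la.rangeEnd)
        {
            // In the runtime, this is where the exposed object would be GC-reported.
            anyLive = anyLive || la.exposedObjectAlive;
        }
    }
    return anyLive;
}

int main()
{
    // Two allocators associated with the same reused image range: the first had its
    // weak ref severed when the collectible ALC went away, the second is still live.
    std::vector<LoaderAllocatorStub> allocators = {
        { 0x1000, 0x2000, false },
        { 0x1000, 0x2000, true  },
    };

    uintptr_t methodNamePtr = 0x1800;  // interior pointer into the reused image

    // A first-match lookup lands on the allocator with the severed ref (the failure above)...
    assert(!FindFirstAssociated(allocators, methodNamePtr)->exposedObjectAlive);

    // ...while reporting all associated allocators finds a live one, so the check holds.
    assert(ReportAllAssociated(allocators, methodNamePtr));
    return 0;
}
```

The real change operates on the VM's loader-allocator ranges and GC-reports each live exposed object it finds; the stubs above only model the "at least one live ref" condition the assert should check.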

Part of #84834, cc @dotnet/samsung @mangod9 @janvorli

dotnet-policy-service bot added the community-contribution label (indicates that the PR has been added by a community member) on Dec 20, 2024

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

clamp03 added and then removed the arch-riscv label (related to the RISC-V architecture) on Dec 20, 2024
@risc-vv

risc-vv commented Dec 20, 2024

27573ee is being scheduled for building and testing

GIT: 27573ee4a2110113adb028a8347845071f30eb38
REPO: dotnet/runtime
BRANCH: main

Release-CLR-build FAILED

buildinfo.json

Compilation failed during coreclr-tests build


@risc-vv

risc-vv commented Dec 20, 2024

27573ee is being scheduled for building and testing

GIT: 27573ee4a2110113adb028a8347845071f30eb38
REPO: tomeksowi/runtime
BRANCH: gc-report-all-LoaderAllocators

@risc-vv

risc-vv commented Dec 20, 2024

27573ee is being scheduled for building and testing

GIT: 27573ee4a2110113adb028a8347845071f30eb38
REPO: tomeksowi/runtime
BRANCH: gc-report-all-LoaderAllocators

Release-CLR-build FAILED

buildinfo.json

Compilation failed during coreclr-tests build


@janvorli
Member

I believe the bug is actually in reusing the image for a different loader allocator. We will need to fix that instead of the symptoms.

@tomeksowi
Contributor Author

I believe the bug is actually in reusing the image for a different loader allocator. We will need to fix that instead of the symptoms.

I can have a look. To be exact, would the fix be to suppress image reuse for a different allocator, or to keep the same allocator for reused images?

In general, are multiple loader allocators per memory range a valid state? If not, it's worth asserting against.

@jkotas
Member

jkotas commented Dec 20, 2024

I believe the bug is actually in reusing the image for a different loader allocator.

We do maintain one mapping per file path. That is how we end up reusing the same image for different loader allocators.

If we change to multiple mappings per file path, we will end up with a different set of issues - for example, we won't be able to use LoadLibrary to map images on Windows anymore.
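
For illustration only (hypothetical names, not the runtime's actual image cache), the path-keyed reuse described above boils down to something like:

```cpp
// Toy sketch of a path-keyed image cache: a second load of the same file,
// even from a different load context, gets back the existing mapping.
#include <map>
#include <memory>
#include <string>

struct ImageMapping
{
    std::string path;  // the mapped file contents would live alongside this in real code
};

class ImageCache
{
    std::map<std::string, std::shared_ptr<ImageMapping>> m_byPath;

public:
    std::shared_ptr<ImageMapping> GetOrLoad(const std::string& path)
    {
        auto it = m_byPath.find(path);
        if (it != m_byPath.end())
            return it->second;  // reuse the existing mapping for this path
        auto mapping = std::make_shared<ImageMapping>(ImageMapping{path});
        m_byPath.emplace(path, mapping);
        return mapping;
    }
};

int main()
{
    ImageCache cache;
    auto first  = cache.GetOrLoad("ConflictingNames.dll");   // hypothetical file name
    auto second = cache.GetOrLoad("ConflictingNames.dll");   // "second ALC": same mapping comes back
    return first == second ? 0 : 1;
}
```

Because the cache is keyed by path, a second load of the same file from another load context hands back the same mapping, which is how two LoaderAllocators end up associated with one image range.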

@janvorli
Member

I was misled; I had forgotten that we don't allow loading R2R stuff into ALCs, so I thought the issue was about sharing the actual machine code, which we should not share. We do share the IL, though, as @jkotas mentioned.

@janvorli
Member

Looking into it more - I had not remembered that we added handling of RVA fields in collectible assemblies a couple of years ago. That's why we record the ranges of collectible assembly images to associate them with a loader allocator, and why we can have multiple loader allocators associated with the same image's range for this purpose.

@janvorli
Member

The fix looks correct then. I am sorry for the confusion.

@tomeksowi
Contributor Author

Release-CLR-build FAILED

Infrastructure failure; I'll undraft when the tests pass locally.
