-
-
Notifications
You must be signed in to change notification settings - Fork 978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
overhead and workload invocation sequences diverge #2305
Comments
@AndyAyersMS Thanks for the bug report! I can confirm that it is a severe issue that affects all the benchmarks that return structs (except public struct MyStruct
{
public int Value;
}
public class Benchmarks
{
[Benchmark]
public MyStruct Foo() => new MyStruct();
} However, it's not clear to me how to properly fix this problem. Below are some of my thoughts on this. Firstly, we have two primary ways to consume the return value of the workload method: consumer.Consume(workloadDelegate().Value); // Workload:1
consumer.Consume(workloadDelegate()); // Workload:2 The first one ( Secondly, we have several ways to define the overhead delegate: // Overhead:1
private System.Int32 __Overhead()
{
return default(System.Int32);
}
// Overhead:2
private MyStruct __Overhead()
{
return new MyStruct();
}
// Overhead:3
private MyStruct value;
private System.Int32 __Overhead()
{
return value;
} Here are some comments about all of these options:
Since we don't know in advance how the given benchmark obtains the struct value, it's quite hard to provide a proper At the moment, I have only one idea of how to provide a proper baseline for benchmarks that return structs:
public struct OverheadStruct
{
public int Value;
}
private OverheadStruct __Overhead()
{
return new OverheadStruct();
} Since
public delegate OverheadStruct OverheadDelegate();
private void OverheadActionUnroll(System.Int64 invokeCount)
{
for (System.Int64 i = 0; i < invokeCount; i++)
{
consumer.Consume(overheadDelegate().Value);
public delegate System.Numerics.Vector3 WorkloadDelegate();
private void WorkloadActionUnroll(System.Int64 invokeCount)
{
for (System.Int64 i = 0; i < invokeCount; i++)
{
consumer.Consume(overheadDelegate().X); I don't feel like it is a perfect solution, but it should (seemingly) provide a more accurate baseline for benchmarks that return structs. @AndyAyersMS @adamsitnik What do you think? |
@AndreyAkinshin What about adding a |
Also, if you use |
@timcassell With such an approach, |
I would assume that any consume code would be replicated in overhead regardless, no? Anyway, I was thinking of simplifying it to this. public unsafe void Consume<T>(in T value)
=> ptrHolder = (IntPtr) Unsafe.AsPointer(ref Unsafe.AsRef(in value)); Then it won't matter how large the struct is, it's only grabbing its reference. What do you think? (current implementation passes it to |
And I think the overhead method can be changed to this: private MyStruct __Overhead()
{
Unsafe.SkipInit(out MyStruct value);
return value;
} That matches the workload type, and avoids the cost of field reading and constructor call and zero-initializing. |
Sounds good.
It's a great idea, I like it! Do you want to send a PR? |
Sure thing. |
Why is a consumer even needed here? Isn't it enough to just do a NoInline method call? |
Another thought is to not optimize these methods, that way the return value can be unconsumed but presumably would always still be produced. But it would make overhead higher, which is probably less desirable. |
I thought the same thing. There was a short discussion about that in #2173. |
If these don't match then benchmark times will be either under or over estimated.
Context dotnet/runtime#86033 (comment)
This is the "good" case where the jit isn't messing up the workload invoke codegen (which just makes the problem worse).
May be restricted to certain return types like Vector; I haven't seen this elsewhere.
The text was updated successfully, but these errors were encountered: