vector matrix multiplication in nnue #5550
-
Check the non-SIMD implementation:

    std::memcpy(output, biases, sizeof(OutputType) * OutputDimensions);
    // Traverse weights in transpose order to take advantage of input sparsity
    for (IndexType i = 0; i < InputDimensions; ++i)
    {
        const InputType in = input[i];
        if (in)
        {
            const WeightType* w = &weights[i];
            for (IndexType j = 0; j < OutputDimensions; ++j)
                output[j] += w[j * PaddedInputDimensions] * in;
        }
    }

It doesn't matter whether the array is defined as
If the top-leftmost 4x4 of the weight matrix is the identity and the bias is zero, the result is
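To make the loop above easy to try in isolation, here is a minimal scalar sketch of the same traversal. The sizes (32 padded inputs, 16 outputs), the input values 1, 2, 3, 4 and the function names are assumptions made for illustration; they are not taken from the engine or from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sizes, chosen only for this example.
constexpr std::size_t kInputs  = 32;  // assumed to equal the padded input size
constexpr std::size_t kOutputs = 16;

// Scalar affine transform: output = biases + W * input, where W is stored
// row-major as weights[j * kInputs + i] (j = output index, i = input index).
std::vector<std::int32_t> affine_scalar(const std::vector<std::uint8_t>& input,
                                        const std::vector<std::int8_t>&  weights,
                                        const std::vector<std::int32_t>& biases) {
    assert(input.size() == kInputs);
    assert(weights.size() == kInputs * kOutputs);
    assert(biases.size() == kOutputs);
    std::vector<std::int32_t> output(biases);  // start from the biases
    for (std::size_t i = 0; i < kInputs; ++i) {
        const std::int32_t in = input[i];
        if (in == 0) continue;  // zero inputs contribute nothing, so skip them
        for (std::size_t j = 0; j < kOutputs; ++j)
            output[j] += weights[j * kInputs + i] * in;
    }
    return output;
}

// Identity in the top-left 4x4 block, zero bias, assumed input 1,2,3,4,0,...
std::vector<std::int32_t> identity_example() {
    std::vector<std::uint8_t> input(kInputs, 0);
    for (std::size_t i = 0; i < 4; ++i)
        input[i] = static_cast<std::uint8_t>(i + 1);
    std::vector<std::int8_t> weights(kInputs * kOutputs, 0);
    for (std::size_t k = 0; k < 4; ++k)
        weights[k * kInputs + k] = 1;  // W[k][k] = 1
    const std::vector<std::int32_t> biases(kOutputs, 0);
    return affine_scalar(input, weights, biases);
}
```

Calling `identity_example()` under these assumptions gives 1, 2, 3, 4 followed by twelve zeros, since each of the first four outputs simply copies the matching input.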
-
Thanks a lot. I read your answer as: OK, since the design is correct, it's time to implement the multiplication. Notice that this implementation gives the expected output
And this impl uses m256_add_dpbusd_epi32 like your impl. And if I try to get to your impl, I end up with
But this produces nonsense. So I am still missing something very important.
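When a dpbusd-style accumulation produces unexpected numbers, it can help to compare against a scalar model of what one such step computes. The sketch below assumes the usual semantics of that operation: every 32-bit lane receives the dot product of 4 unsigned bytes from one operand and 4 signed bytes from the other. The name `dpbusd_model` is hypothetical, and the model ignores the intermediate 16-bit saturation that a non-VNNI fallback built on a multiply-add of byte pairs can have.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of one "dot product of unsigned and signed bytes" step.
// acc has L lanes; a and b each hold 4 * L bytes.
// Lane l accumulates a[4l..4l+3] (unsigned) times b[4l..4l+3] (signed).
std::vector<std::int32_t> dpbusd_model(std::vector<std::int32_t> acc,
                                       const std::vector<std::uint8_t>& a,
                                       const std::vector<std::int8_t>&  b) {
    assert(a.size() == b.size());
    assert(a.size() == 4 * acc.size());
    for (std::size_t lane = 0; lane < acc.size(); ++lane)
        for (std::size_t k = 0; k < 4; ++k)
            acc[lane] += static_cast<std::int32_t>(a[4 * lane + k])
                       * static_cast<std::int32_t>(b[4 * lane + k]);
    return acc;
}
```

Because each lane sums four adjacent bytes, the four weights that share a lane all end up in the same accumulator. One thing worth checking in a SIMD version is therefore whether the weight bytes in each group of four really belong to the same output and to the four inputs they are multiplied with.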
-
Thanks a lot.
Is there an explanation for this? I looked at the scrambled matrix too - it just looks weird :-) But it did help!
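One way to see why a scrambled weight matrix can make sense is to write the permutation out and check that it leaves the product unchanged. The index formula below is an assumption, written from memory of how such layouts are commonly built; all function names are hypothetical. For every group of 4 consecutive inputs it stores, output by output, the 4 weights that multiply that group, so 4 broadcast input bytes can be paired with one contiguous run of weights.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Maps a row-major index i = j * padded + in (j = output, in = input) to
// (in / 4) * outputs * 4 + j * 4 + in % 4. Assumed layout, see lead-in.
std::size_t scrambled_index(std::size_t i, std::size_t padded, std::size_t outputs) {
    return (i / 4) % (padded / 4) * outputs * 4 + i / padded * 4 + i % 4;
}

// Copies row-major weights into the scrambled layout.
std::vector<std::int8_t> scramble(const std::vector<std::int8_t>& w,
                                  std::size_t padded, std::size_t outputs) {
    assert(padded % 4 == 0 && w.size() == padded * outputs);
    std::vector<std::int8_t> s(w.size());
    for (std::size_t i = 0; i < w.size(); ++i)
        s[scrambled_index(i, padded, outputs)] = w[i];
    return s;
}

// Reference product on the row-major layout.
std::vector<std::int32_t> multiply_row_major(const std::vector<std::uint8_t>& input,
                                             const std::vector<std::int8_t>& w,
                                             std::size_t outputs) {
    const std::size_t padded = input.size();
    std::vector<std::int32_t> out(outputs, 0);
    for (std::size_t j = 0; j < outputs; ++j)
        for (std::size_t i = 0; i < padded; ++i)
            out[j] += static_cast<std::int32_t>(input[i]) * w[j * padded + i];
    return out;
}

// Same product, walking the scrambled layout one 4-byte input chunk at a time.
// The innermost k-loop is what a single 32-bit dot-product lane would compute.
std::vector<std::int32_t> multiply_scrambled(const std::vector<std::uint8_t>& input,
                                             const std::vector<std::int8_t>& s,
                                             std::size_t outputs) {
    assert(input.size() % 4 == 0);
    std::vector<std::int32_t> out(outputs, 0);
    for (std::size_t c = 0; c < input.size() / 4; ++c)
        for (std::size_t j = 0; j < outputs; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                out[j] += static_cast<std::int32_t>(input[4 * c + k])
                        * s[(c * outputs + j) * 4 + k];
    return out;
}
```

Under this assumed layout the weights for consecutive outputs sit next to each other within a chunk, so a vector load of 32 weight bytes covers 8 outputs at once.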
-
So I have a working toy impl now. I noticed that the size of the output is now limited by the SIMD size. Correct?
This impl is dense, so I want a sparse impl too.
As you can see there is no
Nevertheless, the time dropped from 4µs to 1µs (very sparse input data). If my impl does well, why do you have this 'find_nnz' stuff and, more importantly, the nnz array and the 2 loops (1x input + 1x nnz)? Did I still miss something?
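For comparison, the two-loop structure asked about here can be sketched in scalar form. This is not the engine's code: `find_nonzero_chunks` and `multiply_sparse` are hypothetical names, and the weight layout (4 weights per output for each 4-byte input chunk, stored chunk by chunk) is an assumption. The first pass only records which input chunks contain a non-zero byte; the second pass multiplies only those chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pass 1: indices of the 4-byte input chunks that contain any non-zero byte.
std::vector<std::size_t> find_nonzero_chunks(const std::vector<std::uint8_t>& input) {
    assert(input.size() % 4 == 0);
    std::vector<std::size_t> nnz;
    for (std::size_t c = 0; c < input.size() / 4; ++c)
        if (input[4 * c] | input[4 * c + 1] | input[4 * c + 2] | input[4 * c + 3])
            nnz.push_back(c);
    return nnz;
}

// Pass 2: accumulate only the listed chunks.
// Assumed weight layout: s[(c * outputs + j) * 4 + k] is the weight for
// output j and input 4 * c + k.
std::vector<std::int32_t> multiply_sparse(const std::vector<std::uint8_t>& input,
                                          const std::vector<std::int8_t>& s,
                                          std::size_t outputs) {
    assert(s.size() == input.size() * outputs);
    std::vector<std::int32_t> out(outputs, 0);
    for (std::size_t c : find_nonzero_chunks(input))
        for (std::size_t j = 0; j < outputs; ++j)
            for (std::size_t k = 0; k < 4; ++k)
                out[j] += static_cast<std::int32_t>(input[4 * c + k])
                        * s[(c * outputs + j) * 4 + k];
    return out;
}
```

One common motivation for separating the passes is that the multiply loop then contains no data-dependent branch and the index list can itself be built with vector compares; whether that is the reason in the engine is not stated in this thread.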
-
Yes I did. I should just copy/paste.
-
I am still trying to understand the affine transformation in NNUE.
I am focusing on the vector-matrix multiplication.
No bias, no sparsity, to keep it as simple as possible.
Is this design correct (weight[M][N] vs weight[N][M])?
What is the expected outcome? Is it
output = 1,2,3,4,0,0,0,0,0,0,0,0,0,0,0,0 ?