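Consider the following test case:

# Module paths below follow recent Exo releases; adjust them for your version.
from exo import proc
from exo.platforms.x86 import *  # mm256_loadu_ps, AVX2
from exo.stdlib.scheduling import *  # replace_all

def test_blah():
    @proc
    def foo():
        x: f32[8, 3] @ AVX2
        y: f32[8, 3] @ DRAM
        for c in seq(0, 3):
            # i indexes the outer dimension, so consecutive i are 3 elements apart
            for i in seq(0, 8):
                x[i, c] = y[i, c]

    foo = replace_all(foo, mm256_loadu_ps)
    print(foo)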
We should not be able to replace the innermost loop with the vector instruction: the innermost (contiguous) dimension of both buffers is the channel dimension c, so the dimension indexed by i has a stride of 3, not 1. However, the replacement succeeds.
I suspect that the issue is that Unification isn't considering which dimension it unified against (i), and is just directly checking whether each of the buffer arguments (x and y) has a 0th-dimension stride of 1, which passes. However, it should really be checking whether the dimension that i corresponds to has a stride of 1 (which it doesn't).
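To make the stride argument concrete, here is a minimal, self-contained sketch in plain Python. It does not use Exo and is not how Exo's unifier is implemented; the helper names `row_major_strides` and `window_strides` are hypothetical, and it assumes a dense row-major layout for `f32[8, 3]`.

```python
def row_major_strides(shape):
    """Element strides of a dense row-major buffer with the given shape."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return tuple(strides)


def window_strides(shape, kept_dims):
    """Strides of a window that keeps only `kept_dims` of the buffer.

    Dimensions not listed are fixed to a single index (e.g. `c` in
    x[0:8, c]), so they disappear from the window.
    """
    full = row_major_strides(shape)
    return tuple(full[d] for d in kept_dims)


shape = (8, 3)  # x: f32[8, 3]

# The loop over i walks dimension 0, so the window is x[0:8, c].
win_over_i = window_strides(shape, kept_dims=[0])

# For comparison, a window over the channel dimension: x[i, 0:3].
win_over_c = window_strides(shape, kept_dims=[1])

# A unit-stride requirement on the window's dimension 0 should only
# hold for the window over c.
i_is_unit_stride = win_over_i[0] == 1
c_is_unit_stride = win_over_c[0] == 1
```

Under these assumptions, the window that the loop over i produces has stride 3 in its only dimension, so a unit-stride precondition on that window should reject the replacement.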
One option, which can potentially produce false negatives (rejecting a replacement that is actually valid), is to check the instruction's asserts after unification. I believe the front-end already has checks for such things, so those could be refactored and reused.
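A post-unification check of this kind could look roughly like the sketch below. It is an illustration only, not Exo's front-end code: `StrideAssert` and `failed_asserts` are hypothetical names, and the strides of each bound window are assumed to be known as plain integers (or missing when they cannot be determined).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StrideAssert:
    """Models a precondition of the form `assert stride(arg, dim) == expected`."""
    arg: str
    dim: int
    expected: int


def failed_asserts(asserts, bound_strides):
    """Return the asserts that are not known to hold after unification.

    `bound_strides` maps each instruction argument to the strides of the
    window it was unified with. A missing or too-short entry counts as a
    failure, which is conservative and is where false negatives come from.
    """
    failed = []
    for a in asserts:
        strides = bound_strides.get(a.arg)
        if strides is None or a.dim >= len(strides) or strides[a.dim] != a.expected:
            failed.append(a)
    return failed


# Preconditions modeled on a unit-stride vector load.
load_asserts = [StrideAssert("dst", 0, 1), StrideAssert("src", 0, 1)]

# Windows x[0:8, c] and y[0:8, c] over f32[8, 3]: stride 3 in dimension 0.
bad = failed_asserts(load_asserts, {"dst": (3,), "src": (3,)})

# Windows over a contiguous dimension: stride 1.
good = failed_asserts(load_asserts, {"dst": (1,), "src": (1,)})

# Stride of src could not be determined: rejected conservatively.
unknown = failed_asserts(load_asserts, {"dst": (1,)})
```

The replacement would be accepted only when the returned list is empty. Treating an unknown stride as a failure keeps the check sound at the cost of occasionally rejecting a valid replacement.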