replace_all failure? #699

Open
gilbo opened this issue Sep 4, 2024 · 2 comments
Labels
C: Scheduling The scheduling language and APIs T: Bug Something isn't working

Comments

@gilbo
Contributor

gilbo commented Sep 4, 2024

Consider the following code:

def rank_k_reduce_6x16_scheduled(K: size, A: f32[6, K] @ DRAM,
                                 B: f32[K, 16] @ DRAM, C: f32[6, 16] @ DRAM):
    C_reg: f32[6, 4, 4] @ Neon
    for i0 in seq(0, 6):
        for jo in seq(0, 4):
            neon_vld_4xf32(C_reg[i0, jo, 0:4], C[i0, 4 * jo:4 + 4 * jo])
    for k in seq(0, K):
        B_reg: f32[4, 4] @ Neon
        for io in seq(0, 4):
            neon_vld_4xf32(B_reg[io, 0:4], B[k, 4 * io:4 + 4 * io])
        for i in seq(0, 6):
            A_reg: f32[4] @ Neon
            neon_broadcast_4xf32(A_reg[0:4], A[i + 0, k + 0:k + 1])
            for jo in seq(0, 4):
                for ji in seq(0, 4):
                    C_reg[i, jo, ji] += A_reg[ji] * B_reg[jo, ji]
    for i0 in seq(0, 6):
        for jo in seq(0, 4):
            neon_vst_4xf32(C[i0, 4 * jo:4 + 4 * jo], C_reg[i0, jo, 0:4])

and the following instruction:

@instr("{dst_data} = vfmaq_f32({dst_data}, {lhs_data}, {rhs_data});")
def neon_vfmaq_4xf32_4xf32(
    dst: [f32][4] @ Neon, lhs: [f32][4] @ Neon, rhs: [f32][4] @ Neon
):
    assert stride(dst, 0) == 1
    assert stride(lhs, 0) == 1
    assert stride(rhs, 0) == 1

    for i in seq(0, 4):
        dst[i] += lhs[i] * rhs[i]

The scheduling operation replace(neon, neon.find_loop('ji'), neon_vfmaq_4xf32_4xf32) appears to perform the replacement correctly. However, the scheduling operation neon = replace_all(neon, neon_vfmaq_4xf32_4xf32) does not find the replacement.

This bug was produced on the pip version of Exo.

@gilbo gilbo added T: Bug Something isn't working C: Scheduling The scheduling language and APIs labels Sep 4, 2024
@gilbo
Contributor Author

gilbo commented Sep 4, 2024

This may be caused by my use of a custom Neon user library, which leads to name pollution: two different memory classes, both named Neon.
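
As a rough illustration of why two classes that share a name can fail to match, here is a minimal pure-Python sketch. It does not use Exo at all: make_memory_class, LibraryNeon, UserNeon, and annotation_matches are hypothetical stand-ins, and whether Exo compares memory annotations by class identity is an assumption, not something confirmed in this thread.

```python
# Hypothetical stand-ins; these are NOT Exo's real classes or checks.
def make_memory_class():
    # Each call builds a brand-new class object that happens to be named "Neon".
    class Neon:
        pass
    return Neon


LibraryNeon = make_memory_class()  # stands in for the Neon shipped with a package
UserNeon = make_memory_class()     # stands in for a Neon from a custom user library

# The names agree, so printed procedures would look identical...
same_name = LibraryNeon.__name__ == UserNeon.__name__
# ...but the class objects are distinct.
same_class = LibraryNeon is UserNeon


def annotation_matches(buffer_mem, instr_mem):
    # Assumed identity-based comparison, for illustration only.
    return buffer_mem is instr_mem
```

Under that assumption, a buffer annotated with one Neon would not match an instruction whose arguments are annotated with the other, even though both print as "@ Neon".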

@gilbo
Contributor Author

gilbo commented Sep 4, 2024

Yes, I just discovered the problem. stdlib.inspection starts with the following import block

from exo import *
from exo.libs.memories import *
from exo.platforms.x86 import *
from exo.platforms.neon import *
from exo.syntax import *
from exo.API_cursors import *
from exo.stdlib.analysis import *
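
A small sketch of how wildcard imports like the ones above can silently rebind a name. The module names hypothetical_user_neon and hypothetical_platform_neon are made up for this example and are built in memory; the import order shown is an assumption about how the clash could arise, not a trace of what happened here.

```python
import sys
import types


def make_module(name):
    # Build an in-memory module that defines a public class named Neon.
    mod = types.ModuleType(name)
    exec("class Neon:\n    pass\n", mod.__dict__)
    sys.modules[name] = mod
    return mod


user_lib = make_module("hypothetical_user_neon")
platform_lib = make_module("hypothetical_platform_neon")

# Simulate a script's global namespace receiving two wildcard imports in a row.
namespace = {}
exec("from hypothetical_user_neon import *", namespace)
first = namespace["Neon"]   # bound to the user library's class
exec("from hypothetical_platform_neon import *", namespace)
second = namespace["Neon"]  # rebound, with no warning, to the other class
```

After the second wildcard import, the name Neon refers to a different class than before, while anything defined earlier still holds a reference to the first one.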

I will save mildly irate messages for Slack.

It is OK to close this bug, or to divert it into another issue about factoring the platform libraries out of the main Exo package.
