Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix smem swizzle for matmul #588

Merged
merged 2 commits into from
Jul 13, 2023
Merged

Fix smem swizzle for matmul #588

merged 2 commits into from
Jul 13, 2023

Conversation

zasdfgbnm
Copy link
Collaborator

@zasdfgbnm zasdfgbnm commented Jul 13, 2023

In #387, we have to do a

    int64_t swizzle_period =
        std::gcd(n_rows / repeated_pattern_size, tile_size_y / n_cols);

in order to make our swizzling algorithm work for epilogue. This looks more like an empirical hack whose only goal is to creates a square block. Although it empirically worked, I struggled to find a first-principle explanation for this approach. So I read through my original PR #155 multiple times and think through things carefully. But the more I read and think, the more I feel that the original implementation in #155 does not make sense. The problem is, #155 tries to interleave the entire ldmatrix_rows / repeated_pattern_size with an equal size split on tile y dimension. This is overkill, because we just need to evenly distribute rows on different megabanks, and as long as we do so, the number of rows can be arbitrarily large and we can still be bank-conflict free. So we should be swizzling on a (g, g) block instead of a (potentially much larger) (ldmatrix_rows / repeated_pattern_size, ldmatrix_rows / repeated_pattern_size) block.

@zasdfgbnm
Copy link
Collaborator Author

!build

@liqiangxl
Copy link
Collaborator

I tested the change in this PR with #387, it looks good.

Copy link
Collaborator

@liqiangxl liqiangxl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zasdfgbnm zasdfgbnm merged commit 2e4c1e5 into main Jul 13, 2023
@zasdfgbnm zasdfgbnm deleted the smem-swizzle-fix branch July 13, 2023 23:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants