
Error of fused_dense module of flash_attn #188

Open
Jenny199099 opened this issue Sep 23, 2024 · 2 comments
@Jenny199099
Hi author, I compiled and installed fused_dense_lib successfully, but when I run the finetuning code I hit this error: "RuntimeError: linear_act_forward failed.", which is raised from line 291 of InternVideo2/single_modality/models/internvideo2.py. The complete traceback is shown below:

Traceback (most recent call last):
File "", line 1, in
File "/nvme/miniconda3/envs/internvideo/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/nvme/miniconda3/envs/internvideo/lib/python3.8/site-packages/flash_attn/ops/fused_dense.py", line 457, in forward
out = fused_mlp_func(
File "/nvme/miniconda3/envs/internvideo/lib/python3.8/site-packages/flash_attn/ops/fused_dense.py", line 391, in fused_mlp_func
return FusedMLPFunc.apply(
File "/nvme/miniconda3/envs/internvideo/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 97, in decorate_fwd
return fwd(*args, **kwargs)
File "/nvme/miniconda3/envs/internvideo/lib/python3.8/site-packages/flash_attn/ops/fused_dense.py", line 257, in forward
output1, *rest = fused_dense_cuda.linear_act_forward(
RuntimeError: linear_act_forward failed.

Could you please help me fix this? Thank you very much.

@Andy1621
Collaborator

I have not encountered this problem. Maybe you can refer to this issue: Dao-AILab/flash-attention#289 (comment).

@Jenny199099
Author

Thank you for your reply. I followed the comment above and disabled fused_mlp when training with DeepSpeed, which fixed the error. A sketch of the fallback is below.
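
For anyone who hits the same error: the workaround is simply to skip flash_attn's fused MLP path and fall back to a standard PyTorch MLP when training under DeepSpeed. Below is a minimal sketch of such a fallback; the class and argument names (PlainMlp, embed_dim, mlp_ratio) are illustrative and not the exact InternVideo2 signature, so adapt them to however the MLP is constructed in internvideo2.py.

import torch.nn as nn

class PlainMlp(nn.Module):
    # Plain PyTorch MLP used in place of flash_attn's FusedMLP,
    # so fused_dense_cuda.linear_act_forward is never called.
    def __init__(self, embed_dim, mlp_ratio=4.0, act_layer=nn.GELU, drop=0.0):
        super().__init__()
        hidden_dim = int(embed_dim * mlp_ratio)
        self.fc1 = nn.Linear(embed_dim, hidden_dim)   # first projection
        self.act = act_layer()                        # activation (GELU by default)
        self.fc2 = nn.Linear(hidden_dim, embed_dim)   # second projection
        self.drop = nn.Dropout(drop)

    def forward(self, x):
        x = self.drop(self.act(self.fc1(x)))
        return self.drop(self.fc2(x))

If the model constructor exposes a switch for the fused path (for example a use_fused_mlp-style flag, if your version of the code has one), setting it to False and routing through a plain MLP like this avoids the fused kernel entirely.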
