training error #60

Open
squirreljj opened this issue Oct 20, 2023 · 1 comment
Comments

@squirreljj

Environment:
I set up a Docker container from the Voxel R-CNN image: docker pull djiajun1206/pcdet-pytorch1.5
My GPU is a 3080 Ti.
Command:
My training command is: python train.py --cfg_file cfgs/kitti_models/sfd.yaml
with batch_size = 1.
Error:
2023-10-20 12:14:19,784 INFO Start training kitti_models/sfd(default)
epochs: 0%| | 0/12 [00:10<?, ?it/s]
Traceback (most recent call last): | 0/3712 [00:00<?, ?it/s]
File "train.py", line 200, in <module>
main()
File "train.py", line 155, in main
train_model(
File "/home/SFD/tools/train_utils/train_utils.py", line 86, in train_model
accumulated_iter = train_one_epoch(
File "/home/SFD/tools/train_utils/train_utils.py", line 38, in train_one_epoch
loss, tb_dict, disp_dict = model_func(model, batch)
File "/home/SFD/pcdet/models/__init__.py", line 30, in model_func
ret_dict, tb_dict, disp_dict = model(batch_dict)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/SFD/pcdet/models/detectors/sfd.py", line 11, in forward
batch_dict = cur_module(batch_dict)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/SFD/pcdet/models/backbones_3d/spconv_backbone.py", line 148, in forward
x = self.conv_input(input_sp_tensor)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/spconv-1.2.1-py3.8-linux-x86_64.egg/spconv/modules.py", line 134, in forward
input = module(input)
File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/spconv-1.2.1-py3.8-linux-x86_64.egg/spconv/conv.py", line 196, in forward
out_features = Fsp.indice_subm_conv(features, self.weight,
File "/usr/local/lib/python3.8/dist-packages/spconv-1.2.1-py3.8-linux-x86_64.egg/spconv/functional.py", line 87, in forward
return ops.indice_conv(features,
File "/usr/local/lib/python3.8/dist-packages/spconv-1.2.1-py3.8-linux-x86_64.egg/spconv/ops.py", line 118, in indice_conv
return torch.ops.spconv.indice_conv(features, filters, indice_pairs,
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)

@squirreljj
Author

To get the build to succeed, I ran export TORCH_CUDA_ARCH_LIST="7.5" before running python setup.py develop. The build then succeeded, but training fails with the error shown in my previous comment.
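
One possible cause, not confirmed in this thread, is an architecture mismatch. The RTX 3080 Ti reports compute capability 8.6, while the extensions were built for 7.5 only. CUDA binaries built for one major architecture do not run on a different major architecture unless PTX is also embedded. The sketch below checks whether an arch list covers a given GPU. The helper names `parse_arch_list`, `arch_list_covers`, and `report_local_gpu` are hypothetical and written for illustration. The parser handles only numeric entries such as "7.5" or "7.5+PTX", not named architectures.

```python
import os


def parse_arch_list(value):
    """Parse a TORCH_CUDA_ARCH_LIST-style string into (major, minor, has_ptx) tuples.

    Only numeric entries ("7.5", "8.6+PTX") are handled; named entries raise ValueError.
    """
    archs = []
    for item in value.replace(";", " ").split():
        has_ptx = item.endswith("+PTX")
        number = item[:-4] if has_ptx else item
        major, minor = number.split(".")
        archs.append((int(major), int(minor), has_ptx))
    return archs


def arch_list_covers(value, capability):
    """Return True if code built for `value` can run on a GPU with `capability`.

    A binary built for X.y runs on devices X.z with z >= y (same major version).
    PTX for X.y can be JIT-compiled on any device with capability >= X.y.
    """
    capability = tuple(capability)
    for major, minor, has_ptx in parse_arch_list(value):
        if major == capability[0] and minor <= capability[1]:
            return True
        if has_ptx and (major, minor) <= capability:
            return True
    return False


def report_local_gpu():
    """Print the local GPU capability and compare it with TORCH_CUDA_ARCH_LIST, if set."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; skipping the local GPU check.")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible to PyTorch.")
        return
    capability = torch.cuda.get_device_capability(0)
    print("CUDA version PyTorch was built with:", torch.version.cuda)
    print("GPU compute capability:", capability)
    arch_list = os.environ.get("TORCH_CUDA_ARCH_LIST", "")
    if arch_list:
        print("Arch list covers this GPU:", arch_list_covers(arch_list, capability))


if __name__ == "__main__":
    # The setting from this issue: built for 7.5, run on an 8.6 device.
    print(arch_list_covers("7.5", (8, 6)))
    report_local_gpu()
```

This check covers only the compiled extension. The CUDA toolkit inside the container, including cuBLAS, also has to support the GPU, and support for compute capability 8.6 was added in CUDA 11.1, so an image built around PyTorch 1.5 may fail on this card regardless of the arch list.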
