Reminder
System Info
Windows 10
NVIDIA RTX 4090 Laptop GPU
llama-factory: 0.9.2.dev0
CUDA version: 12.2
torch 2.5.1+cu124
Python 3.12
Transformers version: 4.46.1
bitsandbytes 0.44.1.dev0+9315692
Datasets version: 3.1.0
Accelerate version: 1.0.1
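One detail worth noting in the versions above: the system info lists Python 3.12, while the warning paths in the log below point at `...\Python\Python311\site-packages`, which may mean two interpreters are installed and the training run is not using the environment where the CUDA-enabled torch was verified. A minimal sketch (standard version attributes only; not taken from the original report) to confirm which environment is actually active:

```python
import sys

import accelerate
import bitsandbytes
import datasets
import torch
import transformers

# Which interpreter is running and where its packages live.
print(sys.executable)
print(torch.__file__)

# Versions as seen by this interpreter; torch.version.cuda is None
# on a CPU-only wheel, "12.4" on the +cu124 build listed above.
print(torch.__version__, torch.version.cuda)
print(transformers.__version__, datasets.__version__,
      accelerate.__version__, bitsandbytes.__version__)
```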
Reproduction
torch can detect the GPU, but at runtime it reports that no CUDA environment was detected.
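For reference, the GPU check mentioned here is presumably along these lines (plain torch API; a sketch, not taken from the report):

```python
import torch

# True when the installed torch build can see a CUDA device.
print(torch.cuda.is_available())

if torch.cuda.is_available():
    # Driver-reported name, e.g. the RTX 4090 Laptop GPU listed above.
    print(torch.cuda.get_device_name(0))
```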
Running `python -m bitsandbytes` reports the following error.
Training used the identity dataset from the example data, but it is taking very long: the run sits for a long while after the last log line below, `Number of trainable parameters = 20,971,520`. Is this normal?
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.checkpointing:157 >> Gradient checkpointing enabled.
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-23 17:41:27] llamafactory.model.adapter:157 >> Upcasting trainable params to float32.
[INFO|2024-12-23 17:41:27] llamafactory.model.adapter:157 >> Fine-tuning method: LoRA
[INFO|2024-12-23 17:41:27] llamafactory.model.model_utils.misc:157 >> Found linear modules: k_proj,up_proj,v_proj,o_proj,q_proj,down_proj,gate_proj
C:\Users\PowerPC\AppData\Roaming\Python\Python311\site-packages\bitsandbytes\backends\cpu_xpu_common.py:29: UserWarning: g++ not found, torch.compile disabled for CPU/XPU.
warnings.warn("g++ not found, torch.compile disabled for CPU/XPU.")
[INFO|2024-12-23 17:41:27] llamafactory.model.loader:157 >> trainable params: 20,971,520 || all params: 8,051,232,768 || trainable%: 0.2605
[INFO|trainer.py:698] 2024-12-23 17:41:27,324 >> Using cpu_amp half precision backend
[INFO|trainer.py:2313] 2024-12-23 17:41:27,488 >> ***** Running training *****
[INFO|trainer.py:2314] 2024-12-23 17:41:27,488 >> Num examples = 91
[INFO|trainer.py:2315] 2024-12-23 17:41:27,488 >> Num Epochs = 3
[INFO|trainer.py:2316] 2024-12-23 17:41:27,488 >> Instantaneous batch size per device = 2
[INFO|trainer.py:2319] 2024-12-23 17:41:27,488 >> Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|trainer.py:2320] 2024-12-23 17:41:27,488 >> Gradient Accumulation steps = 8
[INFO|trainer.py:2321] 2024-12-23 17:41:27,489 >> Total optimization steps = 15
[INFO|trainer.py:2322] 2024-12-23 17:41:27,491 >> Number of trainable parameters = 20,971,520
0%| | 0/15 [00:00&lt;?, ?it/s]C:\Users\PowerPC\AppData\Roaming\Python\Python311\site-packages\transformers\trainer.py:3536: FutureWarning: `torch.cpu.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cpu', args...)` instead.
  ctx_manager = torch.cpu.amp.autocast(cache_enabled=cache_enabled, dtype=self.amp_dtype)
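Two things stand out in this log. First, `Using cpu_amp half precision backend` means the Trainer is running on the CPU, and on the CPU a single optimization step of an 8B-parameter LoRA run can take a very long time, so a long pause right after `Number of trainable parameters = 20,971,520` would be expected in that mode rather than a hang. Second, the FutureWarning is cosmetic: it only flags the deprecated spelling of CPU autocast. A standalone sketch of the replacement it asks for (illustrative only, not LLaMA-Factory code):

```python
import torch

# Deprecated form from the warning: torch.cpu.amp.autocast(...)
# Current form: torch.amp.autocast("cpu", ...)
with torch.amp.autocast("cpu", dtype=torch.bfloat16):
    x = torch.randn(2, 4) @ torch.randn(4, 4)

print(x.dtype)  # torch.bfloat16 under CPU autocast
```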
Expected behavior
I would like to know where the problem lies; I have tried many approaches and none of them resolved it.
Others
No response