LLM inference on Lunar Lake 258v causes system reboot #1435
Comments
https://huggingface.co/datasets/endomorphosis/LunarLake_Crash_MemDump/resolve/main/MEMORY.DMP
Could you please share the following information with us to further investigate the issue?
I have provided the code where the breakpoint should be placed. Here is the entrypoint to that code. The crash occurs intermittently, roughly once every 3 or 4 runs.
It did not appear to have anything to do with running out of system RAM. The only difference between the vanilla implementation from the examples list and the Python code I wrote, which crashes the system in the ov_model.generate() function, is that I also have CUDA dependencies, because I am writing code that auto-loads models regardless of hardware platform and model architecture and multiplexes the inference endpoints from API providers.
https://github.com/endomorphosis/ipfs_accelerate_py/blob/a1cb9ca8e0d8623bf8ddc66daed350ff2cf27dfd/ipfs_accelerate_py/worker/skillset/hf_llava.py#L325
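Since the crash is intermittent and takes the whole machine down, one way to confirm which run triggered the reboot is to write a marker to disk before and after each call. The following is only a minimal sketch under stated assumptions: `run_with_breadcrumbs` and the log file name are hypothetical helpers invented for illustration, not part of OpenVINO or the linked repository, and `step` stands in for whatever wraps the `ov_model.generate()` call.

```python
import os
import time
from typing import Callable


def run_with_breadcrumbs(step: Callable[[], object], runs: int, log_path: str) -> int:
    """Call `step` `runs` times, forcing a marker line to disk around each call.

    Hypothetical helper: after an unexpected reboot, a "start" line with no
    matching "done" line identifies the run that was in flight.
    """
    completed = 0
    with open(log_path, "a", encoding="utf-8") as log:
        for i in range(runs):
            log.write(f"{time.time():.3f} run {i} start\n")
            log.flush()
            os.fsync(log.fileno())  # push the marker past OS buffers before the risky call

            step()  # e.g. a wrapper around ov_model.generate(...)

            log.write(f"{time.time():.3f} run {i} done\n")
            log.flush()
            os.fsync(log.fileno())
            completed += 1
    return completed
```

With the model from the linked file already loaded, this could be invoked as `run_with_breadcrumbs(lambda: ov_model.generate(**inputs), runs=10, log_path="generate_runs.log")`, where `inputs` is assumed to be the already-prepared generation arguments.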