Is anyone running the Command-R model via an external Ollama server with this integration?
Whenever I set a larger context size and send a request to Ollama from the llama conversation integration, the model appears to be loaded onto just one GPU. I never get a response, and the model is unloaded after some time.
If I then use Open WebUI with the same model and the same context size, I can see all GPUs being used and the model works fine there.
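For comparison, this is roughly how one could send the same request directly to the Ollama HTTP API, bypassing the integration. It's a sketch only: the host, model tag, prompt, and num_ctx value are placeholders, assuming the default Ollama port (11434).

```python
import requests

# Sketch: hit the external Ollama server directly with a large context size,
# mirroring what the integration is expected to send. Host/model/num_ctx are
# placeholder values for illustration.
OLLAMA_URL = "http://ollama-host:11434/api/generate"

resp = requests.post(
    OLLAMA_URL,
    json={
        "model": "command-r",
        "prompt": "Hello, can you respond?",
        "stream": False,
        "options": {"num_ctx": 16384},  # the larger context size being tested
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["response"])
```

If a direct request like this spreads the model across all GPUs (e.g. visible in nvidia-smi) while requests from the integration do not, the difference presumably lies in the options the integration passes along with the request.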