We currently have no specialized solution whatsoever for running LLM evals in the background.
After receiving the request, we simply start iterating over the examples to annotate on the backend. At each iteration of the loop, we check the `running` flag and stop if it is set to `False`.
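For illustration, here is a minimal sketch of that loop; the `running` flag and `annotate_example` are hypothetical stand-ins for the actual factgenie internals:

```python
# minimal sketch of the current stop-flag approach (names are illustrative)
running = {"flag": True}

def annotate_example(example):
    # placeholder for the per-example LLM call
    print(f"annotating {example}")

def run_eval(examples):
    for example in examples:
        # check the flag at each iteration; stop once it is set to False
        if not running["flag"]:
            break
        annotate_example(example)

run_eval(["ex1", "ex2", "ex3"])
```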
Surprisingly, this seems to work quite OK so far, probably because Flask handles the threading for us.
However, it seems too YOLO. I also don't expect it to work robustly, especially if users start launching multiple tasks at once.
At first, I also tried managing Python threads manually in the code, something along the lines of the following minimal sketch (the `threading` usage is illustrative; `annotate_examples` is a hypothetical helper):
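```python
import threading

def annotate_examples(examples):
    # placeholder for the annotation loop sketched above
    for example in examples:
        print(f"annotating {example}")

# spawn a background thread so the request handler can return immediately
worker = threading.Thread(
    target=annotate_examples,
    args=(["ex1", "ex2", "ex3"],),
    daemon=True,
)
worker.start()
worker.join()  # only for this standalone demo; a request handler would just return
```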
But that actually rendered the frontend unresponsive (though I might have just messed it up). In any case, a more principled solution would be much appreciated.
I would rather switch to async (asyncio / async HTTP), where we can wait for thousands of responses without affecting server performance. We always delegate the heavy work to another server, and I think this works well for us.
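A rough sketch of what that could look like with `asyncio` and `aiohttp` (the endpoint and payloads are illustrative, not an actual factgenie API):

```python
import asyncio
import aiohttp

API_URL = "http://localhost:8000/annotate"  # hypothetical worker endpoint

async def annotate(session, example):
    # each request just awaits the worker server's response
    async with session.post(API_URL, json={"example": example}) as resp:
        return await resp.json()

async def run_eval(examples):
    async with aiohttp.ClientSession() as session:
        # thousands of requests can be awaited concurrently; the event loop
        # stays responsive while the worker server does the heavy lifting
        return await asyncio.gather(*(annotate(session, ex) for ex in examples))

results = asyncio.run(run_eval([f"example {i}" for i in range(1000)]))
```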
If you are worried about scaling the worker nodes, I would start solving that only once the waiting times become too bad for users in some use case. Somebody would have to demonstrate that first. Personally, I don't think I will ever need it in factgenie.