Describe the bug
I stupidly configured Parsl so my htex has 0 workers (htex._workers_per_node == 0). This leads to the following logs:
parsl.jobs.strategy:214 _general_strategy DEBUG: Slot ratio calculation: active_slots = 0, active_tasks = 3
parsl.jobs.strategy:217 _general_strategy DEBUG: Executor HighThroughputExecutor has 3 active tasks, 0/1 running/pending blocks, and 0 connected workers
parsl.jobs.strategy:266 _general_strategy DEBUG: Strategy case 2: slots are overloaded - (slot_ratio = active_slots/active_tasks) < parallelism
parsl.jobs.strategy:275 _general_strategy DEBUG: Strategy case 2b: active_blocks 1 < max_blocks 2 so scaling out
htex never launches any workers, so the strategy keeps asking for more blocks. Because tasks_per_node == htex.workers_per_node == 0, we get the following error:
parsl.utils:352 make_callback ERROR: Callback threw an exception - logging and proceeding anyway
Traceback (most recent call last):
  File "/kyukon/data/gent/vo/000/gvo00003/vsc43633/micromamba/envs/test_psiflow/lib/python3.10/site-packages/parsl/utils.py", line 350, in make_callback
    self.callback(*self.cb_args)
  File "/kyukon/data/gent/vo/000/gvo00003/vsc43633/micromamba/envs/test_psiflow/lib/python3.10/site-packages/parsl/jobs/job_status_poller.py", line 22, in poll
    self._strategy.strategize(self._executors)
  File "/kyukon/data/gent/vo/000/gvo00003/vsc43633/micromamba/envs/test_psiflow/lib/python3.10/site-packages/parsl/jobs/strategy.py", line 163, in _strategy_simple
    self._general_strategy(executors, strategy_type='simple')
  File "/kyukon/data/gent/vo/000/gvo00003/vsc43633/micromamba/envs/test_psiflow/lib/python3.10/site-packages/parsl/process_loggers.py", line 26, in wrapped
    r = func(*args, **kwargs)
  File "/kyukon/data/gent/vo/000/gvo00003/vsc43633/micromamba/envs/test_psiflow/lib/python3.10/site-packages/parsl/jobs/strategy.py", line 277, in _general_strategy
    excess_blocks = math.ceil(float(excess_slots) / (tasks_per_node * nodes_per_block))
ZeroDivisionError: float division by zero
Nothing (useful) happens and every call to strategy fails with this same error.
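To make the failure concrete, here is a minimal standalone sketch of the arithmetic from the last frame of the traceback. It is not Parsl code: `excess_blocks_needed` is a hypothetical helper, and the guard is only one possible way the division could be protected.

```python
import math


def excess_blocks_needed(excess_slots: int, tasks_per_node: int, nodes_per_block: int) -> int:
    """Same arithmetic as the failing line in the traceback, plus an explicit guard.

    Hypothetical helper for illustration; not part of Parsl.
    """
    slots_per_block = tasks_per_node * nodes_per_block
    if slots_per_block <= 0:
        # Without this check, the division below raises ZeroDivisionError.
        raise ValueError(
            f"a block provides {slots_per_block} slots "
            f"(tasks_per_node={tasks_per_node}, nodes_per_block={nodes_per_block})"
        )
    return math.ceil(float(excess_slots) / slots_per_block)


# The unguarded expression fails as soon as tasks_per_node is 0:
try:
    math.ceil(float(3) / (0 * 1))
except ZeroDivisionError as exc:
    print(f"unguarded expression failed: {exc}")

# With a valid configuration, 3 excess slots at 2 slots per block need 2 blocks.
print(excess_blocks_needed(excess_slots=3, tasks_per_node=2, nodes_per_block=1))
```

A guard like this would at least replace the bare ZeroDivisionError with a message that points at the misconfiguration.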
I have two questions:
Is there a use case for an executor without any workers? If not, Parsl could detect this and raise an error during initialisation.
If the scaling strategy fails repeatedly, maybe Parsl should give up after a fixed number of retries instead of retrying indefinitely and getting stuck?
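Both suggestions could look roughly like the sketch below. Everything in it is hypothetical and independent of Parsl's actual classes: `validate_workers_per_node` stands in for an init-time check, and `GiveUpAfter` wraps any zero-argument callback and stops after a chosen number of consecutive failures.

```python
from typing import Callable


def validate_workers_per_node(workers_per_node: int) -> int:
    """Hypothetical init-time check: reject a configuration that can never run a task."""
    if workers_per_node < 1:
        raise ValueError(f"workers_per_node must be at least 1, got {workers_per_node}")
    return workers_per_node


class GiveUpAfter:
    """Hypothetical wrapper: re-raise once `callback` has failed `max_failures` times in a row."""

    def __init__(self, callback: Callable[[], None], max_failures: int = 3) -> None:
        self.callback = callback
        self.max_failures = max_failures
        self.consecutive_failures = 0

    def __call__(self) -> None:
        try:
            self.callback()
        except Exception as exc:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_failures:
                # Surface the problem instead of logging and retrying forever.
                raise RuntimeError(
                    f"giving up after {self.consecutive_failures} consecutive failures"
                ) from exc
        else:
            # Any success resets the counter, so only persistent failures abort.
            self.consecutive_failures = 0
```

Whether aborting is the right reaction, or whether the executor should instead be marked as failed, is a design decision for the maintainers.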