When rejudging a large contest, or when receiving many submissions for problems with many testcases, some submissions can take much longer in wall time than in CPU time. With a short timelimit overshoot, these submissions might be judged as TLE even if they are correct.
This is what actually happened in a recent ICPC Asia Regional Contest (with ~350 teams and an easy problem with 50 testcases). After a lot of time spent bisecting the kernel and debugging, it turned out that a lock contention issue (2 global locks: shrinker_rwsem and cgroup_mutex) in kernels < 6.3 can, under heavy load, block kernel operations such as cgroup operations and page fault handling inside a memory cgroup for several seconds.
Although judgedaemon (runguard) cannot "fix" this issue in code, mentioning the kernel issue in the documentation could be helpful for server admins.
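For admins who want a quick check on a judgehost, a minimal bash sketch along these lines could flag kernels older than 6.3. The helper name `kernel_older_than` is hypothetical and not part of DOMjudge; it only compares the major and minor numbers reported by `uname -r`, so distribution kernels with backported fixes would still be flagged.

```shell
#!/usr/bin/env bash
# Sketch: warn when the running kernel predates 6.3.
# kernel_older_than is a hypothetical helper, not part of DOMjudge.

# Returns 0 (true) if release string $1 is older than major $2, minor $3.
kernel_older_than() {
    local release=$1 want_major=$2 want_minor=$3
    local major minor
    # Split e.g. "5.15.0-91-generic" into major=5 and minor=15.
    IFS=. read -r major minor _ <<<"$release"
    # Drop any non-numeric suffix such as "-rc1".
    minor=${minor%%[!0-9]*}
    (( major < want_major || (major == want_major && minor < want_minor) ))
}

if kernel_older_than "$(uname -r)" 6 3; then
    echo "kernel $(uname -r) is older than 6.3: lock contention under heavy load is possible"
else
    echo "kernel $(uname -r) is 6.3 or newer"
fi
```

The check is deliberately coarse: it says nothing about whether a given host actually hits the contention, only whether it runs a kernel in the affected range.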
Your environment
DOMjudge/Webserver: any compatible version
OS: Ubuntu 22.04 with kernel 5.15 (default) or 6.2 (latest generic kernel in jammy repo)
Tested in a KVM guest with 32 cores and 21 or 30 judgedaemons, and on a bare-metal dual-CPU (40 cores) server with 21 judgedaemons.
Steps to reproduce
Submit a correct solution many times at once like:
for i in $(seq 1 1000); ~/Downloads/domjudge-8.2.2/submit/submit --url http://localhost:12345/ --contest test -y G.cpp; end
Then wait for judging to finish.
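The one-liner above appears to use fish-style `for ...; end` syntax. For bash users, a rough equivalent might look like the sketch below. The submit path, URL, contest name and `G.cpp` are the ones quoted above; the `SUBMIT`, `COUNT` and `DRY_RUN` variables and the `submit_many` function are assumptions added for illustration. By default the sketch only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: queue N copies of one solution (bash version of the loop above).
# SUBMIT, COUNT and DRY_RUN are assumed knobs, not DOMjudge settings.
SUBMIT=${SUBMIT:-"$HOME/Downloads/domjudge-8.2.2/submit/submit"}
COUNT=${COUNT:-1000}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = really submit

submit_many() {
    local n=$1 i
    for i in $(seq 1 "$n"); do
        if [[ $DRY_RUN == 1 ]]; then
            echo "$SUBMIT --url http://localhost:12345/ --contest test -y G.cpp"
        else
            "$SUBMIT" --url http://localhost:12345/ --contest test -y G.cpp
        fi
    done
}

submit_many "$COUNT"
```

Running it with `DRY_RUN=0` against a test instance performs the actual submissions.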
Expected behaviour
Reasonable judgehost system load, and no submission takes a wall time much longer than its CPU time.
Actual behaviour
Judgehost system load >= 2 * judgedaemon number. With timelimit overshoot set to 1s|10%, some submissions are judged as TLE even though they only take a very short CPU time. Judging is very slow.
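To watch for the load condition described above while reproducing, a small bash sketch such as the following could be run on the judgehost. `load_is_excessive` is a hypothetical helper, and counting daemons with `pgrep -f judgedaemon` assumes that the daemons show up as processes whose command line contains that string.

```shell
#!/usr/bin/env bash
# Sketch: flag a judgehost whose 1-minute load is at least twice the
# number of judgedaemons. load_is_excessive is a hypothetical helper.

# $1 = load average (e.g. 63.20), $2 = number of judgedaemons.
load_is_excessive() {
    awk -v load="$1" -v n="$2" 'BEGIN { exit !(n > 0 && load >= 2 * n) }'
}

# Assumption: each daemon's command line contains "judgedaemon".
daemons=$(pgrep -fc judgedaemon 2>/dev/null || true)
daemons=${daemons:-0}

if [[ -r /proc/loadavg ]]; then
    # The first field of /proc/loadavg is the 1-minute load average.
    read -r load1 _ < /proc/loadavg
    if load_is_excessive "$load1" "$daemons"; then
        echo "load $load1 >= 2 * $daemons judgedaemons"
    else
        echo "load $load1 with $daemons judgedaemons"
    fi
fi
```

A high load alone does not prove the lock contention; it only matches the symptom reported here.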
Any other information that you want to share?
#2157 mentions that "the call cgroup_delete_cgroup_ext did sometimes hang for multiple seconds". I'm afraid that double-checking the rejudging of this contest might be necessary to ensure that no correct solutions were judged as TLE...
If you are interested in this specific kernel issue, I have also written a blog post (in Simplified Chinese) to help explain it to the contestants affected in this regional contest and to server admins of later contests.
Thanks a lot for this big write-up. We normally advise against running many judgehosts on one machine (since there will always be some overhead), but it might indeed be worth mentioning this explicitly.
Since you mentioned that disabling CLONE_NEWIPC would fix this issue, how about using seccomp to restrict IPC-related syscalls rather than creating an IPC namespace?
Theoretically yes, but it would be a bit difficult to list all IPC-related syscalls, and the potential side effects of using seccomp are unknown.
Note on the kernel issue described above (lock contention on shrinker_rwsem and cgroup_mutex in kernel < 6.3): this is fixed (or alleviated) after kernel commit torvalds/linux@da27f79.