[gridengine users] preventing certain jobs from being suspended (subordinated)

bergman at merctech.com
Wed Sep 4 19:58:54 UTC 2019


Our SoGE (8.1.6) configuration has essentially two queues: one for "all"
jobs (all.q) and one for "short" jobs (short.q). The all.q is subordinate
to the short.q, so short jobs can suspend jobs running in the all.q. At the
moment, the all.q contains nodes both with and without GPU resources (not
ideal, not permanent, and probably to be replaced in the future with
multiple queues, but it's what we have now).
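
For readers less familiar with subordination, the following is a minimal
sketch of how a setup like this is typically expressed in the short.q
configuration (as shown by "qconf -sq short.q"). The threshold of 1 is an
assumption for illustration, not a value taken from our actual
configuration; only the queue names come from the description above.

```
# Illustrative excerpt of a short.q queue configuration.
# "all.q=1" is an assumed threshold: all.q on a host would be suspended
# as soon as one slot of short.q on that same host is in use.
qname                 short.q
subordinate_list      all.q=1
```

With queue-wise subordination of this kind, the suspension applies to the
all.q queue instance on the affected host, which is why GPU jobs running
there are caught by it along with everything else.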

Our GPU jobs do not stop or free resources when suspended (OK, the CPU
portion may respond correctly to SIGSTOP, but the GPU portion keeps
running).

Is there any way, with our current number of queues, to exempt jobs
using a GPU resource complex (-l gpu) from being suspended by short jobs?

Thanks,

Mark

