[gridengine users] preventing certain jobs from being suspended (subordinated)

Tina Friedrich tina.friedrich at it.ox.ac.uk
Thu Sep 5 11:57:51 UTC 2019


We had this problem a lot, and I can't quite remember how I solved it. I 
think it was either a JSV or a qsub wrapper that puts all GPU jobs into 
the superordinate queue.
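
A minimal sketch of the qsub-wrapper idea, not the script we actually ran. 
It assumes the superordinate queue is called short.q and the GPU complex is 
called gpu, as in the original question; the function names are made up for 
illustration. It only looks at command-line options, not at "#$" directives 
embedded in the job script, and it treats every -l or -q on the command 
line as a qsub option.

```python
SUPERORDINATE_QUEUE = "short.q"  # assumption: queue name taken from the thread
GPU_COMPLEX = "gpu"              # assumption: complex name taken from the thread


def requests_gpu(args):
    """Return True if any '-l' option on the command line names the GPU complex."""
    for i, arg in enumerate(args[:-1]):
        if arg == "-l":
            # A resource list looks like "h_vmem=4G,gpu=1" or just "gpu".
            for item in args[i + 1].split(","):
                name = item.split("=", 1)[0].strip()
                if name == GPU_COMPLEX:
                    return True
    return False


def rewrite_args(args):
    """Return the qsub arguments, forcing the superordinate queue for GPU jobs."""
    if not requests_gpu(args):
        return list(args)
    out = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False  # this is the value of a dropped '-q'
            continue
        if arg == "-q":
            skip_next = True   # drop the user's own queue request
            continue
        out.append(arg)
    return ["-q", SUPERORDINATE_QUEUE] + out
```

A wrapper script would pass rewrite_args(sys.argv[1:]) on to the real qsub 
binary. A JSV could apply the same check on the server side instead, so 
that users cannot bypass it by calling qsub directly.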

Now that I'm thinking about this again - does the subordinate queue 
setting accept 'queue@@hostgroup' syntax like everything else? I don't 
remember if I ever tried that.

Tina

On 04/09/2019 21:52, Reuti wrote:
> 
> On 04.09.2019 at 21:58, bergman at merctech.com wrote:
> 
>> Our SoGE (8.1.6) configuration has essentially two queues: one for "all"
>> jobs and one for "short jobs". The all.q is subordinate to the short.q,
>> and short jobs can suspend a job in the general queue. At the moment, the
>> all.q has nodes with & without GPU resources (not ideal, not permanent,
>> probably to be replaced in the future with multiple queues, but it's
>> what we have now).
>>
>> Our GPU jobs do not stop or free resources when suspended (OK, the CPU
>> portion may respond correctly to SIGSTOP, but the GPU portion keeps
>> running).
>>
>> Is there any way, with our current number of queues, to exempt jobs
>> using a GPU resource complex (-l gpu) from being suspended by short jobs?
> 
> Not that I'm aware of. Almost 10 years ago I had a similar idea:
> 
> https://arc.liv.ac.uk/trac/SGE/ticket/735
> 
> -- Reuti