[gridengine users] a way to selectively run queued jobs?

Reuti reuti at staff.uni-marburg.de
Thu Nov 3 00:19:40 UTC 2016


Hi,

On 02.11.2016 at 22:16, Michael Stauffer wrote:

> SoGE 8.1.8
> 
> Hi,
> 
> Is there a way for the admin to selectively run a queued job when there are resources available, but the user's rqs-defined quota has been met? I'd like to say something like:
> 
>    qrun -u <user> -q <queue> --how-many-additional-jobs-to-start-running <number>
> 
> I'm looking for a way to allow some dynamic quota increases when cluster load is low. I currently have a simple system that checks for low load, and will increase everyone's quota and recreate the rqs's. But I'm moving to rqs's based on projects and user-specific limits within projects, and also multiple time-limited queues. So my method of setting and changing rqs's is getting a lot more complicated.
> 
> A complication may be that my rqs's limit based on slots and h_vmem and s_vmem, where h_vmem and s_vmem are a multiple of the number of slots a user is allowed. But if there's some way to run a job regardless of rqs limits, and just have it check the requested resources against host consumables as usual, then this wouldn't matter, I'm thinking.

I don't know how you change your RQS, but it can also be done on the command line, although this is not explained in detail in the man page:

$ qconf -srqs 
{
   name         foo
   description  Demo
   enabled      FALSE
   limit        name special users {reuti} to slots=20
   limit        name common users {*} to slots=10
}

$ qconf -mattr resource_quota limit slots=10 foo/special
$ qconf -mattr resource_quota limit slots=5 foo/2
$ qconf -mattr resource_quota enabled true foo

$ qconf -srqs 
{
   name         foo
   description  Demo
   enabled      TRUE
   limit        name special users {reuti} to slots=10
   limit        name common users {*} to slots=5
}

Several values can also be changed at once:

$ qconf -mattr resource_quota limit h_vmem=1G,slots=25 foo/special

Maybe this way you can write a wrapper which temporarily increases the values for the user in question and sets them back once the job has started.
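
A rough sketch of such a wrapper, only to illustrate the idea. It reuses the `qconf -mattr resource_quota limit` form from the demo above; the function names `set_rqs_limit` and `with_raised_quota` and the fixed delay are hypothetical choices, not part of Grid Engine. A real wrapper would more likely check whether the job has actually started before restoring the limit, rather than sleeping for a fixed time.

```shell
#!/bin/sh
# Hypothetical helpers: temporarily raise one RQS rule, then restore it.
# Rule set "foo" and rule "special" come from the demo output above.

# set_rqs_limit RULESET/RULE LIMITS
# Overwrites the limits of a single rule, e.g. "foo/special" "slots=20".
set_rqs_limit() {
    qconf -mattr resource_quota limit "$2" "$1"
}

# with_raised_quota RULESET/RULE RAISED NORMAL DELAY_SECONDS
# Raises the limit, waits so the scheduler can dispatch pending jobs
# (fixed delay is an assumption), then puts the normal limit back.
with_raised_quota() {
    rule=$1
    raised=$2
    normal=$3
    delay=$4

    # Stop here if the limit could not be raised; nothing to restore.
    set_rqs_limit "$rule" "$raised" || return 1

    sleep "$delay"

    # Restore the usual limit.
    set_rqs_limit "$rule" "$normal"
}
```

It could then be called as `with_raised_quota foo/special slots=20 slots=10 60`. Jobs already running are not affected when the limit is lowered again; the lower value only applies to jobs dispatched afterwards.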

-- Reuti


