[gridengine users] RoundRobin scheduling among users

William Hay w.hay at ucl.ac.uk
Wed Jan 27 11:26:34 UTC 2016


On Tue, Jan 26, 2016 at 12:30:50PM -0600, Dan Hyatt wrote:
>   I am looking to use this differently.
> The problem I am having is that I have users with 200-1000 jobs. I have 80
> servers with almost 1000 cores.
> For my normal queue,  I want SGE PE to create up to 4 jobs per server until
> it runs out of servers, then add up to 4 more until all the jobs are
> allocated.  (1 per is fine as long as it will round robin and start adding a
> second job per server, then a third until it runs out of jobs)
> 
> Does the allocation rule limit the number of jobs per server PER qsub, or
> total jobs allowed per server?
It governs how the PE tasks of a single job (or array task) are distributed,
so it applies per qsub and does not cap the total number of jobs on a server.
If you want the tasks of a single job (or array task) to be the only tasks
on a node, then you also need to request an exclusive resource associated
with the node.
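
As a rough sketch of what that involves: an exclusive resource is a boolean
complex with the EXCL relation, attached to each execution host and then
requested at submit time. The complex name "exclusive", the host name
"node01" and the script "job.sh" are placeholders, not taken from your setup.

```shell
# Line to add to the complex list (edit it with: qconf -mc).
# Columns: name  shortcut  type  relop  requestable  consumable  default  urgency
exclusive  excl  BOOL  EXCL  YES  YES  0  1000

# Line to set in each execution host's configuration
# (edit it with: qconf -me node01; node01 is a placeholder).
complex_values  exclusive=true

# Request the resource at submit time so no other job shares the host.
qsub -l exclusive=true job.sh
```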

> 
> The problem I am having is that I get 20 jobs per server and overload a
> couple of servers while 80 servers running idle. Each has 10 cores and 128
> GB of RAM so they can handle up to 20 light jobs each.
> 
> Also, for the heavy CPU jobs, I want a max of 4 jobs per server, so for
> pe_slots would I just put the integer 4 in there?
If you mean you only want 4 tasks of a single job on each server, then
an allocation_rule of 4 (the integer in place of $pe_slots), combined with
requesting an exclusive resource associated with the host, should do it.
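
For illustration only, the relevant pieces might look like the following.
The PE name "heavy", the queue name "all.q", the complex name "exclusive"
and the script "job.sh" are placeholders, and the sketch assumes the
exclusive complex is already defined and set on each host.

```shell
# Relevant line of the PE definition (edit it with: qconf -mp heavy).
# Other fields are left at whatever your installation already uses.
allocation_rule    4

# Make the PE available in a queue (all.q is a placeholder).
qconf -aattr queue pe_list heavy all.q

# Ask for 8 slots: with allocation_rule 4 that is 4 tasks on each of 2 hosts,
# and the exclusive request keeps other jobs off those hosts.
qsub -pe heavy 8 -l exclusive=true job.sh
```

With an integer allocation rule the slot count requested with -pe should be
a multiple of that integer, otherwise the job is unlikely to be scheduled.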

If, on the other hand, you intend to limit each server to 4 independent
jobs, then you could do this in a number of ways.
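
One possible way, sketched here as an assumption rather than a
recommendation for your site, is a resource quota set that caps the slots
in use on each host. The rule name and file name are made up for the
example, and the limit counts slots, so it equals 4 jobs only when each job
uses one slot.

```shell
# four_per_host.rqs: add it with  qconf -Arqs four_per_host.rqs
{
   name         four_slots_per_host
   description  "At most 4 slots in use on any one host"
   enabled      TRUE
   limit        hosts {*} to slots=4
}
```

The {*} form applies the limit to each host separately instead of to all
hosts combined. If only the heavy jobs should be capped, the limit line
could be narrowed to a queue, for example
"limit queues heavy.q hosts {*} to slots=4", where heavy.q is a
hypothetical queue name.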

William