[gridengine users] Subordinate Queues

Reuti reuti at staff.uni-marburg.de
Fri Jun 15 16:06:43 UTC 2012


On 15.06.2012 at 17:18, Joseph Farran wrote:

> Greetings.
> 
> I am playing with OGE subordinate Queues and I can't seem to get it right.
> 
> All my nodes are 64 cores and I set all my nodes to node pack jobs with:
> 
>    qconf -rattr exechost complex_values "slots=64" node1   ( repeat for all other nodes )

This limits the number of slots used across all queues on the host. So if one queue instance on an exechost uses them all up, a job in the superordinate queue will never start. In this case the slot count needs to be set in the queue definition (the value could be arbitrary if the exechost setting is limiting it to 64, but in your case 64 in the queue definition should do).
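
To make the arithmetic concrete, here is a toy sketch in plain shell. It makes no Grid Engine calls. It assumes that SUB-Q jobs keep their slots booked against the host-wide limit even while suspended, which is one reading of the explanation above; the second limit of 128 is a hypothetical value chosen only for comparison.

```shell
#!/bin/sh
# Toy model of the host-level "slots" consumable; no Grid Engine calls are made.
# Assumption: SUB-Q jobs keep their slots booked against the host limit,
# whether they are running or suspended.

host_limit=64      # complex_values slots=64 on the exechost (from the question)
subq_used=64       # slots held by SUB-Q jobs on node1
q1_request=1       # one new serial job arriving in Q1

# can_start LIMIT USED REQUEST -> prints "yes" if the request fits, else "no"
can_start() {
    if [ $(( $2 + $3 )) -le "$1" ]; then
        echo yes
    else
        echo no
    fi
}

echo "with host limit $host_limit: $(can_start "$host_limit" "$subq_used" "$q1_request")"
# 128 is a hypothetical limit, not a value from the original post.
echo "with host limit 128: $(can_start 128 "$subq_used" "$q1_request")"
```

With the host limit at 64 the Q1 job never fits, so nothing would trigger the suspension of SUB-Q in the first place.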

Subordination will not free any resources such as memory. The suspended job is still on the node, just stopped and holding whatever it had allocated.
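
As a rough starting point for the command lines asked for, the sketch below only prints qconf commands for review and executes nothing. The queue and host names are taken from the question. The suspend threshold "=1" and the host limit of 128 are assumptions for illustration, not values confirmed in this thread, and the helper names `emit` and `plan` are made up for this example. Note that `-rattr` replaces the whole complex_values list on a host, as in the original command.

```shell
#!/bin/sh
# Dry run: prints qconf commands for review instead of executing them.
# Queue and host names come from the original question; the threshold "=1"
# and the host limit of 128 are illustrative assumptions.

emit() { printf '%s\n' "$*"; }

plan() {
    # Per-queue slot counts, set in the queue definitions.
    for q in Q1 Q2 SUB-Q; do
        emit "qconf -mattr queue slots 64 $q"
    done
    # Suspend the SUB-Q instance on a host once Q1 or Q2 uses one slot there.
    for q in Q1 Q2; do
        emit "qconf -mattr queue subordinate_list SUB-Q=1 $q"
    done
    # Assumed: leave room in the host-wide consumable for suspended plus running jobs.
    for h in node1 node2 node3 node4; do
        emit "qconf -rattr exechost complex_values slots=128 $h"
    done
}

plan
```

After applying anything like this, `qconf -sq Q1` shows the resulting queue definition. With a threshold of 1, a single serial job in Q1 suspends every SUB-Q job on that host, and the suspended jobs still occupy memory there.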

-- Reuti


> The scheduler is then set with Load Formula to "slots".   So up to 64 serial jobs get packed onto one node, and after that the 65th and later jobs go to the next node until it reaches 64 cores, and so on.   So each node will only run a max of 64 slots.
> 
> I have queue Q1 which points to node1 & node2, and queue Q2 which points to node3 & node4.
> 
>    Q1 --> ( node1 node2 )
>    Q2 --> ( node3 node4 )
> 
> Now I would like to have a subordinate queue called SUB-Q which is subordinate to Q1 & Q2.   What I am trying to make it do is as follows:
> 
> If jobs ( serial or parallel ) are submitted to SUB-Q, it will use nodes from Q1 or Q2 ( node1 through node4 ).   If jobs are submitted to Q1 or Q2, it will suspend any jobs that were submitted from SUB-Q.
> 
> Here is a scenario:
> 
> 1 64-core parallel job is submitted to SUB-Q and the scheduler picks node1.
> 1 64-core parallel job is submitted to SUB-Q and the scheduler picks node3.
> 64 single-core jobs are submitted to SUB-Q and the scheduler picks node2.
> 64 single-core jobs are submitted to SUB-Q and the scheduler picks node4.
> 
> So now all nodes are full with the following jobs:
> 
>    (job 1) node1 running a 64-core MPI job.
>    (job 2) node3 running a 64-core MPI job.
>    (job 3) node2 running 64 serial jobs.
>    (job 4) node4 running 64 serial jobs.
> 
> If a new 128-core MPI job is submitted to Q1, have the scheduler suspend jobs #1 & #3 (node1 & node2) and then run the new 128-core job.
> 
> If 64 new serial jobs are submitted to Q2, have the scheduler suspend either job #2 or job #4.
> 
> I have a few other questions, but this is a good start.
> 
> If possible please provide the OGE command lines to set this up as I am still a newbie with OGE.
> 
> Thanks,
> Joseph