[gridengine users] SGE PE scheduler problem, doesn't pick least used nodes ?
reuti at staff.uni-marburg.de
Wed Mar 16 11:01:10 UTC 2011
On 16.03.2011 at 11:32, Alex Phillips wrote:
> Dear List,
> We have a cluster of 1920 cores spread over 160 nodes (12 cores/node). We run only one code in one queue, with jobs of between 48 and 256 cores using an MPI PE.
> When benchmarking our code we found a 14-15% speedup by running on 6 cores/node, compared with 12 cores/node.
Yes, this can be seen with certain applications.
> We also found that if we ran on 6 cores/node, with a second job on the other 6 cores/node, we still have a 5-6% speedup.
> So I have configured our MPI PE with allocation_rule = 6, and this works. However, as the cluster fills up, the scheduler starts a second job on some nodes before all the nodes are busy.
Well, there are two problems. First, there is no rule in SGE to prevent a second job on a node with this allocation_rule when the node has 12 slots. You could instead configure two queues with only 6 slots each. The second queue could get a load sensor value (type boolean) in its load_thresholds, which will enable the queue only if the first queue has no slots left (a global load sensor, which will just count the free slots in the primary queue). This is not really safe, as over time something will be running in the second queue and might mislead the free slot count of the primary queue, but I think it's worth testing.
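As a rough illustration of the load sensor idea, here is a minimal Python sketch. It assumes the usual load sensor protocol (wait for a line on stdin, exit on "quit", otherwise answer with a begin/end block) and that `qstat -g c` prints an AVAIL column. The queue name primary.q and the complex name primary_free are made up for this example, and reporting the boolean as 1/0 is an assumption to check against complex(5). It has not been run on a real cluster.

```python
#!/usr/bin/env python3
"""Sketch of a global load sensor: does the primary queue still have free slots?"""
import subprocess
import sys

PRIMARY_QUEUE = "primary.q"    # hypothetical name of the first 6-slot queue
COMPLEX_NAME = "primary_free"  # hypothetical boolean complex


def free_slots(qstat_output, queue):
    """Return the AVAIL value for `queue` from `qstat -g c` style output."""
    columns = None
    for line in qstat_output.splitlines():
        if line.startswith("CLUSTER QUEUE"):
            # The header "CLUSTER QUEUE" is two words but one column.
            columns = line.replace("CLUSTER QUEUE", "CLUSTER_QUEUE").split()
            continue
        fields = line.split()
        if columns and fields and fields[0] == queue:
            return int(fields[columns.index("AVAIL")])
    raise ValueError("queue %r not found in qstat output" % queue)


def report(has_free):
    """Format one load report; 'global' makes the value cluster-wide."""
    return "begin\nglobal:%s:%d\nend\n" % (COMPLEX_NAME, 1 if has_free else 0)


def main():
    while True:
        line = sys.stdin.readline()
        if not line or line.strip() == "quit":
            break
        out = subprocess.run(["qstat", "-g", "c"], capture_output=True,
                             text=True, check=True).stdout
        sys.stdout.write(report(free_slots(out, PRIMARY_QUEUE) > 0))
        sys.stdout.flush()  # the execd waits for the complete report


if __name__ == "__main__":
    main()
```

The idea would be for the second queue to reference primary_free in its load_thresholds, so that it stays disabled while the value is 1. The caveat above still applies: the count only reflects what is free in the primary queue at the moment the sensor is polled.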
The second problem is the missing seq_no for PEs, which would solve this. With it you could set up two PEs for two queues as above, and the second would only be used once nothing is left in the first queue, because of the seq_no. It's an RFE, though.
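To make the intended fill order concrete, here is a small Python simulation. It is not how the SGE scheduler works internally, just a sketch of the policy you are after: every node has a "primary" and a "secondary" block of 6 slots, a job takes 6 slots per node, and secondary blocks are only used once no primary block is free. All names are hypothetical.

```python
def place_job(nodes, slots_needed, per_node=6):
    """Place a job at `per_node` slots per node, preferring primary blocks.

    `nodes` is a list of dicts like {"primary": 6, "secondary": 6} holding
    the free slots per block. Returns a list of (node_index, block) pairs,
    or None if the job does not fit. A job gets at most one block per node.
    """
    if slots_needed % per_node:
        return None
    chunks = slots_needed // per_node
    options = []
    for i, node in enumerate(nodes):
        if node["primary"] >= per_node:
            options.append((0, i, "primary"))    # rank 0: tried first
        elif node["secondary"] >= per_node:
            options.append((1, i, "secondary"))  # rank 1: only as fallback
    options.sort()
    if len(options) < chunks:
        return None
    chosen = [(i, block) for _, i, block in options[:chunks]]
    for i, block in chosen:
        nodes[i][block] -= per_node
    return chosen


if __name__ == "__main__":
    cluster = [{"primary": 6, "secondary": 6} for _ in range(4)]
    for size in (12, 12, 12):
        print(size, place_job(cluster, size))
```

With four nodes, the first two 12-slot jobs land on primary blocks of different nodes, and only the third one starts sharing nodes.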
> How can we configure the scheduler to run one job on all the nodes, before starting a second job ?
> I have tried defining the number of slots as a complex value on the execution hosts, I’ve tried -np_load_avg, np_load_avg, slots, and -slots as the load_formula, but I can’t get it to work.
> I’ve read http://blogs.sun.com/sgrell/entry/grid_engine_scheduler_hacks_least but I can’t set the allocation rule to $pe_slots, as we only want to run on 6 cores/node, not 12.
> Any suggestions ?
> *Alex Phillips*