[gridengine users] Filling up nodes when using gepetools
w.hay at ucl.ac.uk
Thu Jul 30 09:30:42 UTC 2015
On Thu, 30 Jul 2015 08:25:05 +0000
"Winkler, Ursula (ursula.winkler at uni-graz.at)"
<ursula.winkler at uni-graz.at> wrote:
> > $pe_slots restricts you to a single node so I'm guessing the jobs
> > that don't start are jobs that need more than one node.
> Yes, that should be right.
> > While we don't use gepetools we do have a JSV that rewrites
> > people's requested PE based on the number What you need I think is
> > something that routes jobs that request 1 node to PEs with a
> > $pe_slots allocation rule while other jobs are routed to nodes with
> > an allocation rule equal to the requested ppn. In all cases the
> > number of slots to request should be nodes*ppn.
> The problem is that single-node jobs which require just smaller parts
> of the available slots (e.g. 2) are always started on completely free
> nodes instead of starting on already busy nodes which have the
> requested resources too. Jobs which order more than 1 node but also
> few cores/node are typically requiring more than the default of other
> resources (e.g. memory) so they are not the problem. And apart from
> the 1-node-few-slots trouble the scheduling works pretty satisfying.
My comment was meant to be read in the context of:
i) Reuti's earlier suggestion that jobs in PEs with a $pe_slots
allocation_rule will be scheduled according to the queue_sort_method and
load_formula, like serial jobs.
ii) Your response that setting the allocation rule to $pe_slots stops
(multi-node) jobs from scheduling.
The problem is therefore how to take advantage of (i) without causing (ii).
My suggestion was to modify your jsv/gepetools to force single node
parallel jobs into PEs with $pe_slots allocation rules (which gives
you control over where they are scheduled via queue_sort_method and
load_formula) while sending the others to PEs with other (appropriate)
allocation rules that won't cause (ii).
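
To make the routing rule concrete, here is a minimal Python sketch of the
decision only. It is not gepetools code and not a working JSV. The PE names
"smp" and "mpi_<ppn>" are hypothetical placeholders, and the sketch assumes
the site has one PE with a $pe_slots allocation rule plus one PE per
supported ppn whose allocation rule is that fixed ppn value.

```python
from typing import Tuple

# Hypothetical PE names; substitute the PEs that exist at your site.
SINGLE_NODE_PE = "smp"           # assumed: allocation_rule $pe_slots
MULTI_NODE_PE_FMT = "mpi_{ppn}"  # assumed: allocation_rule equal to ppn


def route_pe(nodes: int, ppn: int) -> Tuple[str, int]:
    """Return (pe_name, slots) for a job requesting `nodes` nodes
    with `ppn` slots per node."""
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be positive integers")

    # In all cases the slot count requested is nodes * ppn.
    slots = nodes * ppn

    if nodes == 1:
        # Single-node jobs go to the $pe_slots PE, so they are placed
        # according to queue_sort_method and load_formula.
        return SINGLE_NODE_PE, slots

    # Multi-node jobs go to a PE whose allocation rule matches ppn,
    # so they are not confined to a single host.
    return MULTI_NODE_PE_FMT.format(ppn=ppn), slots
```

In a real JSV the returned PE name and slot count would be written back into
the job's PE request; how that is done depends on the JSV language and
library in use, so it is left out here.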