[gridengine users] Filling up nodes when using gepetools

William Hay w.hay at ucl.ac.uk
Thu Jul 30 09:30:42 UTC 2015


On Thu, 30 Jul 2015 08:25:05 +0000
"Winkler, Ursula (ursula.winkler at uni-graz.at)"
<ursula.winkler at uni-graz.at> wrote:

> > $pe_slots restricts you to a single node so I'm guessing the jobs
> > that don't start are jobs that need more than one node.  
> 
> Yes, that should be right. 
> 
> > While we don't use gepetools we do have a JSV that rewrites
> > people's requested PE based on the number What you need I think is
> > something that routes jobs that request 1 node to PEs with a
> > $pe_slots allocation rule while other jobs are routed to nodes with
> > an allocation rule equal to the requested ppn.  In all cases the
> > number of slots to request should be nodes*ppn.
> 
> The problem is that single-node jobs which require just smaller parts
> of the available slots (e.g. 2) are always started on completely free
> nodes instead of starting on already busy nodes which have the
> requested resources too. Jobs which order more than 1 node but also
> few cores/node are typically requiring more than the default of other
> resources (e.g. memory) so they are not the problem. And apart from
> the 1-node-few-slots trouble the scheduling works pretty satisfying.
> 
My comment was meant to be read in the context of:
i) Reuti's earlier suggestion that jobs in PEs with a $pe_slots
allocation_rule will be scheduled according to the queue_sort_method and
load_formula, like serial jobs.
ii) Your response that setting the allocation rule to $pe_slots stops
(multi-node) jobs from scheduling.
  

The problem is therefore how to take advantage of (i) without
causing (ii).

My suggestion was to modify your JSV/gepetools to force single-node
parallel jobs into PEs with a $pe_slots allocation rule (which gives
you control over where they are scheduled via queue_sort_method and
load_formula) while sending all other jobs to PEs with other
(appropriate) allocation rules that won't cause (ii).
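
The routing decision itself could look roughly like the sketch below.
It is written in Python purely for illustration and is not taken from
gepetools or from our JSV. The PE names "smp" and "mpi_ppn<N>" are
made up, and the JSV plumbing that reads the request and rewrites the
job's PE is left out. It assumes the submitter's request has already
been reduced to a node count and a ppn value.

```python
from typing import Tuple


def route_job(nodes: int, ppn: int) -> Tuple[str, int]:
    """Pick a PE name and slot count for a nodes x ppn request.

    Hypothetical PE names (assumptions, not real site config):
      "smp"        -> a PE whose allocation_rule is $pe_slots
      "mpi_ppn<N>" -> a PE whose allocation_rule is the integer N
    """
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must both be at least 1")

    # In all cases the number of slots requested is nodes * ppn.
    slots = nodes * ppn

    if nodes == 1:
        # Single-node job: a $pe_slots PE keeps it on one host and lets
        # queue_sort_method / load_formula decide which host it lands on.
        return ("smp", slots)

    # Multi-node job: use a PE whose fixed allocation rule equals ppn,
    # so the job is not confined to a single host.
    return (f"mpi_ppn{ppn}", slots)
```

For example, route_job(1, 2) selects the $pe_slots PE with 2 slots,
while route_job(4, 8) selects the hypothetical "mpi_ppn8" PE with 32
slots.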

William