[gridengine users] Complex value consumption in parallel jobs

Maria Mierscheid reuti at staff.uni-marburg.de
Tue Jan 27 12:29:02 UTC 2015


On 27.01.2015 at 13:03, Simon Andrews <simon.andrews at babraham.ac.uk> wrote:
> 
> I've spent the morning tracking down a scheduling problem on our cluster
> which arose from a misunderstanding on how complex values and parallel
> environments interact.
> 
> In our setup we have configured h_vmem to be consumable so we can schedule
> based on the memory requirements of the jobs.  We also have a parallel
> environment set up for SMP jobs which allows the user to reserve multiple
> cores on the same physical machine.
> 
> This morning we found a load of jobs which couldn't be scheduled despite
> us appearing to have plenty of memory and cores free.  Other jobs with
> similar memory requirements and numbers of cores were able to be
> scheduled, but this one set of jobs would only stay queued.
> 
> We eventually figured out that this was because when we set a pe request
> and an h_vmem request, the actual reservation of memory multiplies
> the h_vmem by the number of cores, so we were actually requesting about
> 10X the memory we thought we were after.  I can see that for MPI type jobs
> this makes plenty of sense since they are running independently,
> potentially on different machines.  For SMP jobs though we're actually
> just running different threads so it seems odd to have to make our users
> calculate a 'memory per core' value, rather than an overall value for the
> job.
> 
> Is there therefore any way to configure this behaviour within a pe?  I
> couldn't see anything obvious in the pe or complex config, but this must
> have been something people have addressed before.  For memory it's not so
> bad in that we can at least just divide the allocation, but for something
> like licenses where you only need one for a large SMP job I can't see how
> you could set this up.

No, not within the PE. It can be configured at the complex level (consumable YES vs. JOB; with JOB the request is debited once per job rather than once per slot), but then you have the opposite problem for MPI jobs. I also had the idea in the past of defining the multiplication in the PE:

https://arc.liv.ac.uk/trac/SGE/ticket/197
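
To make the difference between the two consumable settings concrete, here is a small arithmetic sketch in Python. It is not part of Grid Engine; the function name `reserved_memory` and its parameters are made up for illustration, and it only models the accounting rule described above (per-slot debit for YES, per-job debit for JOB).

```python
def reserved_memory(h_vmem_bytes: int, slots: int, consumable: str) -> int:
    """Return the amount debited from the consumable for one job.

    consumable == "YES": the request is multiplied by the granted slots.
    consumable == "JOB": the request is debited once, regardless of slots.
    (Hypothetical helper, only illustrating the accounting rule.)
    """
    if slots < 1:
        raise ValueError("slots must be at least 1")
    if consumable == "YES":
        return h_vmem_bytes * slots
    if consumable == "JOB":
        return h_vmem_bytes
    raise ValueError(f"unsupported consumable setting: {consumable!r}")


GIB = 1024 ** 3

# A 10-slot SMP job asking for h_vmem=4G:
per_slot_accounting = reserved_memory(4 * GIB, 10, "YES")  # 40 GiB debited
per_job_accounting = reserved_memory(4 * GIB, 10, "JOB")   # 4 GiB debited
```

With YES, a user who meant "4G for the whole job" ends up reserving ten times that, which matches the behaviour described in the original question.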

What you can do: in a JSV you can read the name of the requested PE and divide the overall memory request by the number of requested slots.
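
The calculation such a JSV would perform might look roughly like the following Python sketch. It shows only the arithmetic, not the JSV protocol itself: in a real JSV the three inputs would come from the job's PE name, slot count and h_vmem request. The function names and the PE name "smp" are assumptions for illustration. The suffix handling follows the Grid Engine memory specifier convention (lowercase k/m/g are powers of 1000, uppercase K/M/G are powers of 1024), and the result is rounded up so the job as a whole gets at least what the user asked for.

```python
import re

# Multipliers for Grid Engine memory specifiers.
_MULTIPLIERS = {
    "": 1,
    "k": 1000, "K": 1024,
    "m": 1000 ** 2, "M": 1024 ** 2,
    "g": 1000 ** 3, "G": 1024 ** 3,
}


def parse_memory(spec: str) -> int:
    """Convert a memory specifier such as '16G' or '500m' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([kKmMgG]?)", spec.strip())
    if match is None:
        raise ValueError(f"cannot parse memory specifier: {spec!r}")
    number, suffix = match.groups()
    return int(float(number) * _MULTIPLIERS[suffix])


def per_slot_h_vmem(total: str, slots: int, pe_name: str,
                    smp_pes=frozenset({"smp"})) -> str:
    """Return the h_vmem value to submit for a whole-job request.

    For PEs listed in smp_pes (hypothetical names), the overall request
    is divided by the slot count and returned in bytes. For any other
    PE the request is returned unchanged.
    """
    if pe_name not in smp_pes or slots <= 1:
        return total
    total_bytes = parse_memory(total)
    per_slot = -(-total_bytes // slots)  # ceiling division
    return str(per_slot)
```

For example, `per_slot_h_vmem("16G", 8, "smp")` gives `"2147483648"` (2 GiB per slot), while a job in a PE that is not listed in `smp_pes` keeps its request as submitted.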

-- Reuti


