[gridengine users] Slot-wise preemption help
r3spence at gmail.com
Sun Jul 1 20:17:44 UTC 2012
On Sun, Jul 1, 2012 at 12:41 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
> Am 01.07.2012 um 21:30 schrieb Ray Spence:
> > <snip>
> > With this config will SGE kill any job in high.q that attempts to
> > exceed 128G, whether h_vmem is requested upon submission (qsub) or not?
> > That is our goal.
> Yes/no. It will limit the job's memory consumption to the default in the
> complex definition or the requested value on the command line (and you can
> request only up to the definition in the queue, i.e. 128G).
> > What? This is confusing me more. What is the point of even configuring
> > an h_vmem limit in the queue config if SGE does not enforce that limit
> > on every job run on that queue? This cannot be. I thought queue settings
> > override global settings?
> No, the smaller limit will be enforced. And as the one in the complex
> definition (the default) is smaller than the one in the queue definition,
> this will be used. Nevertheless, the user can increase it by request up to
> the one in the queue definition.
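For concreteness, the interplay Reuti describes might look like this (the values below are hypothetical; the complex attribute is edited with `qconf -mc`, the queue limit with `qconf -mq high.q`):

```
# From `qconf -sc` - the h_vmem complex, consumable with a default:
# name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem    h_vmem    MEMORY  <=     YES          YES         4G       0

# In the high.q queue definition (`qconf -mq high.q`):
h_vmem                128G
```

With settings like these, a job submitted without any request would run under the 4G complex default (the smaller limit), while a user could raise it with `qsub -l h_vmem=...` up to, but not past, the queue's 128G.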
What if h_vmem is set not consumable (NO) with no limit defined globally
but is limited in the queue config? Will the queue limit be enforced?
Again, I thought queue-level settings override global ones?
> Maybe it's best you define a queue and set up various limits and request
> different h_vmem values while observing the set rlimits by `ulimit -aH` in
> the job script.
What we are trying to do is accommodate our user base, who cannot be
expected to know how (or be counted on) to specify any job limits
whatsoever. I'm beginning to think SGE cannot do what we want it to do.
How about this: can a default h_vmem limit for a queue be configured in the
global sge_request file? Then every job submitted to that particular queue
would have an h_vmem limit specified. Can the sge_request file accomplish
this? Even so, must h_vmem be globally set consumable with a default value?
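For reference, a default request in sge_request would look like the line below. Note the caveat: as far as I know, sge_request is prepended to every qsub cluster-wide (with per-user ~/.sge_request and per-directory overrides), and cannot be scoped to a single queue, so this would apply to all submissions:

```
# $SGE_ROOT/$SGE_CELL/common/sge_request
# Default request prepended to every qsub, cluster-wide (not per queue):
-l h_vmem=128G
```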
> > Here is what I want SGE to do: any job submitted to this particular
> > queue will be killed if it attempts to allocate more than 128G. How do I
> > config SGE to do this with no other h_vmem limit? If in the global
> > complex definition I set the h_vmem default to 128G, my understanding is
> > that SGE will assume EVERY job, across all queues, might need 128G and
> > so will only run 2 concurrent jobs on each of my nodes, as they are
> > defined to have 248G ram.
> Two times 128G won't fit into this.
Right - then SGE will run only 1 single job on each of our nodes, with at
least 120G (of the 248G available to SGE) of RAM unused. Then we cannot
use SGE.
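The scheduling arithmetic here can be sketched as a toy model of per-host consumable accounting (this is an illustration, not SGE code): each job reserves its requested, or default, h_vmem on the host, and a new job starts only while the remaining capacity covers the reservation.

```python
# Toy model of a consumable h_vmem resource on one host (not SGE code):
# every job reserves per_job_h_vmem_gb, whether it uses it or not.

def jobs_per_host(host_mem_gb: int, per_job_h_vmem_gb: int) -> int:
    """How many concurrent jobs fit when each reserves per_job_h_vmem_gb."""
    return host_mem_gb // per_job_h_vmem_gb

# A 248G node with a 128G default per job: only one job fits,
# leaving 120G reserved-but-unused headroom.
fit = jobs_per_host(248, 128)
unused = 248 - fit * 128
print(fit, unused)  # 1 120
```

This is exactly the objection above: a 128G cluster-wide default makes every job, in every queue, reserve 128G, so a 248G node idles 120G away.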
> -- Reuti