[gridengine users] Exclusive host access + h_vmem
reuti at staff.uni-marburg.de
Tue Oct 30 21:38:31 UTC 2012
On 30.10.2012 at 21:56, Julien Nicoulaud wrote:
> On 29.10.2012 at 17:30, Julien Nicoulaud wrote:
> > I have a special queue for exclusive host access using a forced boolean complex + subordinate queues, as described here: https://blogs.oracle.com/templedf/entry/exclusive_host_access_with_grid.
> > Now I'm in the process of setting up forced memory reservation:
> > • Turned h_vmem into a consumable resource
> > • Set up a value on each exec host
> > It works just fine except for the case of the exclusive queue: it makes no sense getting exclusive access to a host and not being able to use all its memory. Is there a way to:
> > • Somehow automatically set requested h_vmem to granted host h_vmem
> > • Or even just exclude this queue from h_vmem checking
> > Does anyone know a good "pattern" for dealing with this case?
> You mean: if someone requests exclusive access, adjust h_vmem accordingly?
> Yes, I want to automatically set the job h_vmem to the host max (as configured with qconf -me <host>).
> In principle a JSV (job submission verifier) could do this. But for parallel jobs, what is feasible might depend on the actual allocation, which is only determined during scheduling. Are you also requesting a dedicated number of cores per machine? Are you issuing more than one `qrsh -inherit` to a slave node?
> The background for this question is that on the master node of the parallel job, the job script will get h_vmem multiplied by the granted slots on this machine (as any h_vmem request is per slot). But each `qrsh -inherit` will be granted h_vmem only once. So it could be necessary to request the number of machines instead and, for each, to request the full memory.
> I do have some parallel jobs running in this queue, but no core binding, and no "qrsh -inherit".
> But anyway, before handling the case of parallel jobs, I took a dive into the JSV docs/samples, and I must say I'm quite confused about how you would do that with a JSV. I can't see how one can get information about the "elected" host in the JSV, or am I missing something obvious?
No, I was referring to a uniform cluster, where the JSV would just adjust:
$ qsub -l excl foobar.sh
to:
$ qsub -l excl,h_vmem=64G foobar.sh
in case all hosts have 64G. The JSV is used at submission time to adjust resource requests according to some policy of the admin.
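As a rough sketch of that policy only (this is not an actual JSV and uses none of the JSV helper functions), the rewrite could look like the shell function below. The names `add_full_vmem` and `HOST_VMEM` are made up for illustration, the 64G value assumes a uniform cluster, and it only makes sense for serial jobs, since h_vmem is requested per slot.

```shell
#!/bin/sh
# Hypothetical helper, not part of Grid Engine: rewrite a hard resource
# request string the way the submission-time policy would.
# Assumption: every exec host offers the same amount of memory.
HOST_VMEM="${HOST_VMEM:-64G}"

add_full_vmem() {
    request="$1"
    # Wrap the list in commas so each entry can be matched as ",name," or ",name=".
    case ",$request," in
        *,h_vmem=*)
            # The user already chose a memory limit: keep it.
            printf '%s\n' "$request" ;;
        *,excl,*|*,excl=*)
            # Exclusive request without h_vmem: append the host memory.
            # (A request like excl=false is not treated specially here.)
            printf '%s\n' "$request,h_vmem=$HOST_VMEM" ;;
        *)
            # Not an exclusive job: leave the request untouched.
            printf '%s\n' "$request" ;;
    esac
}
```

Called as `add_full_vmem excl`, it returns the request with the host memory appended. A request that already carries h_vmem, or one that does not ask for excl, passes through unchanged. In a real setup the same decision would sit in the verification step of the JSV.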
Thinking about it again with your heterogeneous cluster: why adjust at all? You know your exclusive job will need 16GB if it is scheduled to a 16GB node. If it is instead scheduled to a 64GB exechost, we already know that 16GB is sufficient, so there is no need to change the request to 64GB.