[gridengine users] taming qlogin
4ilya.m+grid at gmail.com
Fri Jun 23 15:38:54 UTC 2017
(Hit Send too early by mistake in previous message)
I am running 6.2u5 with ssh transport for qlogin (not tight integration),
and users are abusing this service: they run jobs for days, abandon
sessions that then stay open forever, etc. So I want to implement
mandatory time limits for all interactive jobs and, perhaps, limit the
number of interactive sessions available to any user.
I was thinking about limiting time in one of two ways: either set h_rt
via JSV (server side) or force all interactive jobs into a dedicated
queue with a time limit. However, there seem to be issues with both.
There seems to be no way to reliably identify an interactive job in JSV:
the only telling attribute is the job name, i.e., QLOGIN or QRLOGIN.
However, some users rename their interactive jobs, so this method will
fail.
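
To make the weakness concrete, here is a minimal Python sketch of the
name-based check described above. It is an illustration only, not a real
JSV script: the params dict stands in for the job parameters a JSV would
see (with "N" assumed to hold the job name), and looks_interactive is a
hypothetical helper.

```python
# Default job names mentioned above for interactive sessions.
DEFAULT_INTERACTIVE_NAMES = {"QLOGIN", "QRLOGIN"}


def looks_interactive(params):
    """Guess whether a job is interactive from its name alone.

    `params` is a hypothetical dict of job parameters; the key "N" is
    assumed to hold the job name.
    """
    return params.get("N") in DEFAULT_INTERACTIVE_NAMES


# A session that keeps its default name is caught by the check ...
print(looks_interactive({"N": "QLOGIN"}))
# ... but the same session renamed by the user slips through.
print(looks_interactive({"N": "my_session"}))
```

Any limit keyed on this check alone could be bypassed simply by passing a
different job name at submission time.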
With the dedicated-queue approach, it is not possible to configure queues
to accept only interactive or only batch jobs, because the relevant
configuration option does not work as expected: "interactive" really
means "immediate", so a qsub job can land in an "interactive" queue and a
qlogin session in a "batch" queue, depending on the use of the "-now"
option.
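
The behavior described above can be modeled with a short Python sketch.
It is a simplified model of what I observe, not Grid Engine code: the
function queue_type_for is hypothetical, and the default "-now" values
per command are my understanding of the man pages, so treat them as an
assumption.

```python
# Assumed default of "-now" per submit command: "no" for qsub,
# "yes" for the interactive commands.
DEFAULT_NOW = {"qsub": False, "qlogin": True, "qrsh": True, "qsh": True}


def queue_type_for(command, now=None):
    """Return the queue type a job is matched against in this model.

    Only the effective "-now" value matters; the submitting command is
    used solely to look up the default when "-now" is not given.
    """
    immediate = DEFAULT_NOW[command] if now is None else now
    return "INTERACTIVE" if immediate else "BATCH"


# With defaults, commands end up where one would expect.
print(queue_type_for("qsub"), queue_type_for("qlogin"))
# Overriding "-now" sends each command to the "wrong" queue type.
print(queue_type_for("qsub", now=True), queue_type_for("qlogin", now=False))
```

In this model the queue type says nothing about which command submitted
the job, which is why a dedicated queue alone does not isolate qlogin
sessions.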
Are there any working solutions for time-limiting interactive jobs?
Another question is how to limit the number of interactive jobs per
user. Interactive jobs could potentially use more than one core/slot, so
I am not sure it is possible to limit them via RQS (resource quota
sets). And how can I let users know that they are being limited because
they already have a certain number of interactive jobs running?
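
The difference between counting sessions and counting slots can be
shown with a small Python sketch. This is not Grid Engine configuration:
the job records, field names, helper functions, and the limit of 2 are
all made up for illustration.

```python
from collections import Counter

# Hypothetical snapshot of running jobs; field names are illustrative only.
RUNNING = [
    {"user": "alice", "interactive": True, "slots": 8},
    {"user": "alice", "interactive": False, "slots": 4},
    {"user": "bob", "interactive": True, "slots": 1},
    {"user": "bob", "interactive": True, "slots": 1},
]


def interactive_sessions(jobs):
    """Count interactive sessions per user, ignoring slots per session."""
    return Counter(j["user"] for j in jobs if j["interactive"])


def interactive_slots(jobs):
    """Sum the slots held by interactive sessions per user."""
    totals = Counter()
    for j in jobs:
        if j["interactive"]:
            totals[j["user"]] += j["slots"]
    return totals


def check_new_session(user, jobs, max_sessions=2):
    """Return (allowed, message) for a new interactive session request."""
    running = interactive_sessions(jobs)[user]
    if running >= max_sessions:
        return False, (f"{user} already has {running} interactive "
                       f"session(s); the limit is {max_sessions}")
    return True, ""


print(interactive_sessions(RUNNING))
print(interactive_slots(RUNNING))
print(check_new_session("bob", RUNNING))
```

Here alice holds one interactive session but eight slots, so a slot-based
limit and a session-based limit would treat her very differently; the
message returned for bob is the kind of feedback I would like users to
see.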
And my last question is, if I somehow succeed in setting time and number
limits for interactive jobs, how can I make sure QRSH calls for parallel
jobs are not affected?