[gridengine users] taming qlogin

Ilya M 4ilya.m+grid at gmail.com
Fri Jun 23 15:38:54 UTC 2017


(Hit Send too early by mistake in previous message)

Hello,

I am running 6.2u5 with ssh transport for qlogin (not tight integration),
and users are abusing this service: they run jobs for days, abandon
sessions that stay open forever, etc. So I want to implement mandatory
time limits for all interactive jobs and, perhaps, limit the number of
interactive sessions available to any user.

I was thinking about limiting time in one of two ways: either setting
h_rt via a server-side JSV or forcing all interactive jobs into a
dedicated queue with a time limit. However, there seem to be issues with
both approaches.

There seems to be no way to reliably identify an interactive job in a
JSV: the only telling attribute is the job name, i.e., QLOGIN or QRLOGIN.
However, some users rename their interactive jobs, so this method will
fail.
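
To illustrate the weakness of a name-based check, here is a minimal
sketch in plain Python. It is not the real JSV API: the `params` dict
and the function name are hypothetical, and the key "N" is only assumed
to mirror the -N (job name) submit option.

```python
# Plain-Python model of a name-based check; NOT the real JSV API.
# `params` is a hypothetical dict standing in for the job parameters
# a verifier would see; "N" is assumed to hold the job name.
INTERACTIVE_NAMES = {"QLOGIN", "QRLOGIN"}


def looks_interactive(params):
    """Guess whether a job is interactive from its name alone."""
    return params.get("N") in INTERACTIVE_NAMES


default_qlogin = {"N": "QLOGIN"}
renamed_qlogin = {"N": "my_session"}  # user renamed the session

print(looks_interactive(default_qlogin))  # True
print(looks_interactive(renamed_qlogin))  # False: slips past the check
```

Any rule keyed only on the name catches the default case and misses
every renamed session.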

With the dedicated-queue approach, it is not possible to configure queues
to accept only interactive or only batch jobs, because the appropriate
configuration option does not work as expected: interactive means
"immediate", so qsub can get into an "interactive" queue and qlogin into
a "batch" queue depending on the use of the "-now" option.
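
The mismatch can be modelled with a toy function in plain Python. This
is a sketch of the behaviour described above, not Grid Engine code; the
function name is hypothetical, and the defaults table assumes that qsub
submits with -now n and qlogin with -now y unless told otherwise.

```python
# Toy model of the queue-type behaviour; not Grid Engine code.
# Assumption: an "interactive" queue admits jobs submitted with -now y,
# and a "batch" queue admits jobs submitted with -now n.
DEFAULT_NOW = {"qsub": "n", "qlogin": "y"}  # assumed client defaults


def eligible_queue_type(client, now=None):
    """Return the queue type a job can land in, given its -now setting."""
    effective = now if now is not None else DEFAULT_NOW[client]
    return "interactive" if effective == "y" else "batch"


print(eligible_queue_type("qsub"))             # batch
print(eligible_queue_type("qlogin"))           # interactive
print(eligible_queue_type("qsub", now="y"))    # interactive
print(eligible_queue_type("qlogin", now="n"))  # batch
```

In this model the queue type follows the -now setting and not the
submit client, which is why the two kinds of job cannot be kept apart.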

Are there any working solutions for time-limiting interactive jobs?

Another question is how to limit the number of interactive jobs per
user. Interactive jobs could potentially use more than one core/slot, so
I am not sure it is possible to limit them via RQS. And how do I let
users know that they are limited because they already have a certain
number of interactive jobs running?
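
The concern about slots can be shown with a small accounting sketch in
plain Python. It is not Grid Engine code; the limit of 4 slots and the
function name are hypothetical, and it assumes the quota counts slots
and not jobs.

```python
# Toy accounting model; not Grid Engine code.
SLOT_LIMIT = 4  # hypothetical per-user slot quota


def within_slot_quota(sessions):
    """sessions: list of slot counts, one entry per interactive job."""
    return sum(sessions) <= SLOT_LIMIT


print(within_slot_quota([1, 1, 1, 1]))  # True: four one-slot sessions fit
print(within_slot_quota([4]))           # True: one session uses the whole quota
print(within_slot_quota([4, 1]))        # False: a second session is refused
```

Under that assumption the same quota allows anywhere from one to four
sessions, so it does not cap the number of sessions as such.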

And my last question is, if I somehow succeed in setting time and number 
limits for interactive jobs, how can I make sure QRSH calls for parallel 
jobs are not affected?

Thank you,
Ilya.


