[gridengine users] Is it possible to have load sensor for cluster-wide parameters?

Reuti reuti at staff.uni-marburg.de
Wed May 2 19:51:53 UTC 2018


> On 02.05.2018 at 21:19, Ilya M <4ilya.m+grid at gmail.com> wrote:
> 
> Hello,
> 
> I am trying to set slot limits in RQS as a percentage of the total slots available for each queue. With nodes coming up and going down, and queue instances entering an error state or being disabled, this number keeps changing. So effectively I need a way to have a dynamic limit for slots.
> 
> I was thinking that it should be possible to come up with a way to get this value from a load sensor and then use this variable in RQS. (E.g. somehow use the output of 'qstat -g c -q <queue>')
> 
> Is it possible?

Yes. The load sensor only needs to run once, on a single host: either on any exechost, or even under a dedicated sge_execd without assigned slots on the qmaster machine.

The hostname reported by the load sensor must be set to "global" so that the value is delivered as a global load value rather than a per-host one.
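
A minimal sketch of such a load sensor in Python, following the usual load sensor protocol (wait for a line on stdin, exit on "quit", otherwise print a begin/end report). The complex name queue_avail_slots and the queue name all.q are hypothetical, and the complex would have to be defined in the complex configuration first. The parser assumes the usual column layout of 'qstat -g c', with AVAIL as the fourth column from the right, so check it against the output on your own cluster.

```python
import subprocess

# Hypothetical names, chosen only for illustration.
COMPLEX_NAME = "queue_avail_slots"
QUEUE = "all.q"


def parse_avail_slots(qstat_output, queue):
    """Return the AVAIL value for `queue` from 'qstat -g c' output.

    Assumes the data row ends with: AVAIL TOTAL aoACDS cdsuE.
    Returns 0 if the queue is not listed.
    """
    for line in qstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 8 and fields[0] == queue:
            return int(fields[-4])
    return 0


def query_avail_slots(queue=QUEUE):
    """Run 'qstat -g c -q <queue>' and extract the available slots."""
    result = subprocess.run(
        ["qstat", "-g", "c", "-q", queue],
        capture_output=True, text=True, check=False,
    )
    return parse_avail_slots(result.stdout, queue)


def format_report(name, value):
    """Build one report; the host field is 'global', not a real hostname."""
    return "begin\nglobal:{}:{}\nend\n".format(name, value)


def run_sensor(stdin, stdout, get_value):
    """Answer each request line with a report until 'quit' or EOF."""
    for line in iter(stdin.readline, ""):
        if line.strip() == "quit":
            break
        stdout.write(format_report(COMPLEX_NAME, get_value()))
        stdout.flush()  # the execd reads the report immediately
```

To use it as an actual sensor script, the file would end with a call such as run_sensor(sys.stdin, sys.stdout, query_avail_slots) after an import sys, and its path would be entered as load_sensor in the configuration of the host that runs it.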

Please note the recent discussion about the problem of combining load values with RQS, which could prevent scheduling. The value could still be used for any alarm or suspend threshold, though.

-- Reuti


