[gridengine users] Is it possible to have load sensor for cluster-wide parameters?

Ilya M 4ilya.m+grid at gmail.com
Wed May 2 21:42:12 UTC 2018


Thank you, Reuti.

On Wed, May 2, 2018 at 12:51 PM, Reuti <reuti at staff.uni-marburg.de> wrote:

>
> > On 02.05.2018 at 21:19, Ilya M <4ilya.m+grid at gmail.com> wrote:
> >
> > Hello,
> >
> > I am trying to set slot limits in RQS as a percentage of the total slots
> > available for each queue. With nodes coming up and going down, and queue
> > instances entering an error state or being disabled, this number keeps
> > changing. So effectively I need a way to have a dynamic limit for slots.
> >
> > I was thinking that it should be possible to come up with a way to get
> > this value from a load sensor and then use this variable in RQS (e.g.
> > somehow use the output of 'qstat -g c -q <queue>').
> >
> > Is it possible?
>
> Yes. The load sensor can run on any exechost (it is only needed once) or even
> under a special sge_execd without assigned slots on the qmaster machine.
>
> The hostname returned by the load sensor must be set to "global" so that
> the value is delivered as a global value.
>
> Please note the recent discussion about the problem of using load values
> and RQS which could prevent scheduling. But the value could be used for any
> alarm or suspend threshold.
>
> -- Reuti
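
For the archive, here is a minimal sketch of such a load sensor in Python. It
is only an illustration under several assumptions: the queue name "all.q" and
the complex name "avail_slots" are hypothetical (the complex would have to be
defined with 'qconf -mc' first), and the parser assumes that 'qstat -g c'
prints a header with an AVAIL column and a two-word "CLUSTER QUEUE" label for
the first column. The report uses "global" as the hostname, as Reuti describes.

```python
#!/usr/bin/env python3
"""Sketch of a Grid Engine load sensor reporting a cluster-wide value."""
import subprocess
import sys

QUEUE = "all.q"          # hypothetical queue name
COMPLEX = "avail_slots"  # hypothetical complex; define it with 'qconf -mc'


def parse_avail(qstat_text, queue):
    """Return the AVAIL count for `queue` from 'qstat -g c' text, or None.

    Assumes the header names a column AVAIL and that "CLUSTER QUEUE" is a
    two-word label for the first column, so data rows are shifted by one.
    """
    column = None
    for line in qstat_text.splitlines():
        fields = line.split()
        if "AVAIL" in fields:
            column = fields.index("AVAIL") - 1
        elif column is not None and fields and fields[0] == queue:
            return int(fields[column])
    return None


def format_report(name, value):
    """Build one report block; the host "global" marks a cluster-wide value."""
    return "begin\nglobal:{}:{}\nend\n".format(name, value)


def main():
    # execd sends a line on stdin to request a report and "quit" to stop.
    for request in sys.stdin:
        if request.strip() == "quit":
            break
        result = subprocess.run(
            ["qstat", "-g", "c", "-q", QUEUE],
            capture_output=True, text=True,
        )
        avail = parse_avail(result.stdout, QUEUE)
        if avail is not None:
            sys.stdout.write(format_report(COMPLEX, avail))
            sys.stdout.flush()  # execd reads the pipe, so flush every report


if __name__ == "__main__":
    main()
```

The script would be registered through the load_sensor parameter of the host
or global configuration. Parsing by header name rather than a fixed position
is meant to tolerate small layout differences, but the column handling should
be checked against the qstat output of the installed version.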