[gridengine users] execd load sensors timing
Reuti
reuti at staff.uni-marburg.de
Mon Jul 9 13:08:44 UTC 2012
On 09.07.2012 at 14:51, William Hay wrote:
> On 9 July 2012 12:50, Reuti <reuti at staff.uni-marburg.de> wrote:
>> On 09.07.2012 at 11:42, William Hay wrote:
>>
>>> When execd starts, is it safe to assume that the load sensors will be
>>> run and reported back to the qmaster/scheduler before the node is
>>> declared contactable/eligible for scheduling again?
>>>
>>> I have a load sensor that reports when the node was last booted and
>>> would like to be sure that the time used for scheduling decisions is
>>> accurate.
>>
>> No. AFAICS when I start the execd on a particular node, the load sensor is not run at startup; it is only triggered at the next interval of the usual cycle.
>>
>> To avoid this, you could also report a BOOLEAN from the load sensor and use it as an entry in load_thresholds in the queue definition. The queue instance then stays in alarm state (i.e. no jobs get scheduled to it) as long as the load sensor doesn't report TRUE to signal that the node is available.
>>
> Would there not be a similar risk there though where the boolean is
> cached from before a reboot or do load thresholds work differently?
If you reboot too fast: yes. The old values have to vanish from the load report first.
You can set "initial_state" to "disabled" in the queue configuration, so that the queue instance on this exechost has to be enabled explicitly after a reboot.
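
As a rough illustration of the BOOLEAN idea, here is a minimal sketch of such a load sensor in Python. It follows the usual load sensor protocol: execd writes a line to the sensor's stdin for each load interval, the sensor answers with a "begin" / "host:name:value" / "end" block, and it exits when it reads "quit". The complex name node_ready, the node_ready() check and the TRUE/FALSE spelling are assumptions made up for this example; the complex would have to be defined in your complex configuration and the script registered as load_sensor on the host.

```python
#!/usr/bin/env python3
"""Sketch of a Grid Engine load sensor reporting a boolean readiness flag."""
import socket
import sys


def format_report(host, values):
    """Build one load report: 'begin', one 'host:name:value' line each, 'end'."""
    lines = ["begin"]
    for name, value in values.items():
        lines.append(f"{host}:{name}:{value}")
    lines.append("end")
    return "\n".join(lines) + "\n"


def node_ready():
    """Hypothetical placeholder: replace with whatever 'ready' means on your nodes."""
    return True


def run_sensor(inp, out, host, ready_check=node_ready):
    """Answer every request line from execd with a report; stop on 'quit' or EOF."""
    while True:
        line = inp.readline()
        if line == "" or line.strip() == "quit":
            break
        # 'node_ready' is an assumed complex name, not a built-in one.
        flag = "TRUE" if ready_check() else "FALSE"
        out.write(format_report(host, {"node_ready": flag}))
        out.flush()  # execd reads from a pipe, so do not leave the report buffered


def main():
    # Assumption: the name from gethostname() matches the name the qmaster
    # uses for this exechost; adjust it if your cell uses a different form.
    run_sensor(sys.stdin, sys.stdout, socket.gethostname())
```

To use it as an actual sensor, the script would end with a call to main(). Note that this alone does not remove the stale-value problem discussed above: a value cached from before the reboot can still be in the load report until the first new report arrives.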
-- Reuti