[gridengine users] gid_range values
w.hay at ucl.ac.uk
Wed Jan 24 08:41:20 UTC 2018
On Tue, Jan 23, 2018 at 06:22:28PM -0600, Calvin Dodge wrote:
> The docs we've found say that gid_range must be greater than the
> number of jobs expected to run concurrently on one host.
> Our recent experience suggests that it has to be greater than the
> total number of jobs in the queue. If it's not, then a few jobs get
> mysteriously killed (typically about 1 in 30-40).
> Has anyone else had that experience? We did fix this by expanding the
> range (it was the default of 20000-20100, which we changed to
> 20200-21000), but would like to know if there's a "best practice"
> regarding the range of values.
Queued jobs shouldn't make a difference. There may be some sort of race
where grid engine holds onto the gid while it runs the epilog (I'm not
sure; I haven't checked exactly when the group is deallocated). Making the
range twice the number of jobs that can run concurrently on one host should
cover this, unless your epilog is getting stuck for some reason.
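
As a rough illustration of the "twice the number of jobs" rule of thumb,
here is a minimal Python sketch. It assumes gid_range is a single
inclusive first-last range, as in the 20000-20100 example above. The
function names, the starting gid of 20000 and the headroom factor of 2 are
choices made for this example, not anything grid engine provides, and the
sketch does not check whether the gids are already used by real groups on
the host.

```python
def range_size(gid_range: str) -> int:
    """Number of gids in an inclusive 'first-last' range string."""
    first, last = (int(part) for part in gid_range.split("-"))
    if last < first:
        raise ValueError("range end is below range start")
    return last - first + 1


def suggest_gid_range(max_jobs_per_host: int,
                      first_gid: int = 20000,
                      headroom: int = 2) -> str:
    """Suggest a range with `headroom` gids per concurrently running job.

    max_jobs_per_host is the most jobs expected to run at once on any
    single host (hypothetical input; queued jobs are not counted).
    """
    if max_jobs_per_host < 1 or headroom < 1:
        raise ValueError("max_jobs_per_host and headroom must be >= 1")
    count = max_jobs_per_host * headroom
    last_gid = first_gid + count - 1  # the range is inclusive at both ends
    return f"{first_gid}-{last_gid}"


if __name__ == "__main__":
    # The two ranges mentioned in the original question.
    print(range_size("20000-20100"))
    print(range_size("20200-21000"))
    # A host that may run up to 64 jobs at once, with 2x headroom.
    print(suggest_gid_range(64))
```

If the suggested range is no larger than the one already configured and
jobs are still being killed, a stuck epilog would be the next thing to
look at.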