[gridengine users] cpu usage calculation
Marshall2, John (SSC/SPC)
john.marshall2 at canada.ca
Fri Aug 31 17:33:04 UTC 2018
I agree with your assessment of the usefulness of -binding. For that reason (and
others) we haven't bothered with it.
Although not optimal (for optimal, ask for all cores), we simply allocate ncores
from an undifferentiated pool (which may be a subset of all cores) that is
associated with a host or queue. This gives us more flexibility to manage and
set up the cores outside of gridengine, while using gridengine only to handle
the consumable (which is easy for users to think about).
On Fri, 2018-08-31 at 15:55 +0100, William Hay wrote:
On Fri, Aug 31, 2018 at 10:27:39AM +0000, Marshall2, John (SSC/SPC) wrote:
When gridengine calculates cpu usage (based on wallclock) it uses:
cpu usage = wallclock * nslots
This does not account for the number of cpus that may be used for
each slot, which is problematic.
I have written up an article at:
which explains the issue and provides a patch (against sge-8.1.9) that uses:
cpu usage = wallclock * nslots * ncpus_per_slot
This makes the usage information much more useful/accurate
when using the fair share.
Have others encountered this issue? Feedback is welcome.
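To make the difference between the two formulas above concrete, here is a
minimal Python sketch. It is not the patch itself and not gridengine code; the
function name cpu_usage and the sample job values are hypothetical, chosen only
to illustrate the arithmetic.

```python
def cpu_usage(wallclock_s: float, nslots: int, ncpus_per_slot: int = 1) -> float:
    """Wallclock-based cpu usage in cpu-seconds.

    With ncpus_per_slot left at 1 this is the stock formula
    (wallclock * nslots); passing the real value gives the
    proposed formula (wallclock * nslots * ncpus_per_slot).
    """
    if wallclock_s < 0 or nslots < 1 or ncpus_per_slot < 1:
        raise ValueError("wallclock must be >= 0; nslots and ncpus_per_slot must be >= 1")
    return wallclock_s * nslots * ncpus_per_slot


# Hypothetical job: 1 hour of wallclock, 4 slots, 8 cpus used by each slot.
wallclock_s = 3600.0
nslots = 4
ncpus_per_slot = 8

stock = cpu_usage(wallclock_s, nslots)                    # ignores cpus per slot
patched = cpu_usage(wallclock_s, nslots, ncpus_per_slot)  # accounts for them

print(f"stock:   {stock:.0f} cpu-seconds")
print(f"patched: {patched:.0f} cpu-seconds")
```

For this sample job the stock formula charges 14400 cpu-seconds while the
proposed one charges 115200, so the stock accounting under-reports usage by a
factor of ncpus_per_slot.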
We used to do something similar (our magic variable was thr, short for
threads). The one thing that moved us away from that was that in 8.x grid
engine binds cores to slots via -binding.
Rather than adding support for another mechanism to specify cores (slots,
-binding) it might be a better idea to support calculating cores per
slot based on -binding.
That said I'm not a huge fan of -binding. If a job has exclusive access
to a node then the job can handle its own core binding. If the job
doesn't have exclusive access then binding strategies other than linear
don't seem likely to be successful.