[gridengine users] Strange behavior with functional scheduling
reuti at staff.uni-marburg.de
Mon Oct 9 22:23:42 UTC 2017
On 10.10.2017 at 00:00, David Rosenstrauch wrote:
> On 2017-10-09 5:45 pm, Reuti wrote:
>> On 09.10.2017 at 23:01, David Rosenstrauch wrote:
>>> I'm a bit of a SGE noob, so please bear with me. We're in the process of a first-time SGE deploy for the users in our department. Although we've been able to use SGE, submit jobs to the queues successfully, etc., we're running into issues trying to get the fair-share scheduling - specifically the functional scheduling - to work correctly.
>>> We have very simple functional scheduling enabled, via the following configuration settings:
>>> enforce_user auto
>>> auto_user_fshare 100
>>> weight_tickets_functional 10000
>>> schedd_job_info true
>>> (In addition, the "weight_tickets_share" setting is set to 0, thereby disabling share tree scheduling.)
>>> A colleague and I are testing this setup by both submitting multiple jobs to one of our queues at the same time: I first submit a large number of jobs (100), and he submits a smaller number (25) shortly afterwards. Our understanding is that the functional scheduling policy should prevent one user's jobs from completely dominating a queue. So our expectation is that even though my jobs were submitted first, and there are more of them, the scheduler should wind up giving his jobs a higher priority so that he is not forced to wait until all of my jobs complete before his run. (If he did have to wait, that would effectively be FIFO scheduling, not fair share.)
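To make that expectation concrete, here is a rough Python sketch of how a proportional split of functional tickets would play out with the settings above. It is only a simplified model, not SGE's actual ticket algorithm: it assumes the 10000 functional tickets are divided among users in proportion to their fshare and then spread evenly over each user's pending jobs. The function name and the user names are made up for illustration.

```python
def functional_tickets(total_tickets, user_shares, pending_jobs):
    """Simplified model (an assumption, not SGE's real algorithm):
    split total_tickets among users by share, then evenly per pending job."""
    # Only users that currently have pending jobs compete for tickets.
    total_share = sum(user_shares[user] for user in pending_jobs)
    per_job = {}
    for user, n_jobs in pending_jobs.items():
        user_tickets = total_tickets * user_shares[user] / total_share
        per_job[user] = user_tickets / n_jobs
    return per_job


# Values from the post: auto_user_fshare 100, weight_tickets_functional 10000,
# 100 pending jobs for one user and 25 for the other (user names are hypothetical).
shares = {"david": 100, "colleague": 100}
jobs = {"david": 100, "colleague": 25}
per_job = functional_tickets(10000, shares, jobs)
print(per_job)
```

Under this model each of the 25 jobs ends up with four times as many tickets as each of the 100 jobs, which is why one would expect the second user's jobs to move ahead in the pending list.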
>> The display of the pending tickets has to be enabled too to see the
>> effect (you should see them as being 0 right now in the pending list):
>> report_pjob_tickets TRUE
>> In addition you can set the:
>> policy_hierarchy F
>> -- Reuti
> Thanks for the feedback.
> We do have report_pjob_tickets set to TRUE. However, our policy_hierarchy is set to OFS. Still, that shouldn't be an issue if we have weight_tickets_share set to zero, should it? (I.e., if we're not using override or share tree tickets, shouldn't this be effectively equivalent to "policy_hierarchy F"?)
Yes, it is effectively equivalent, but the setting can be streamlined to just F.
Are you mixing parallel and serial jobs? By default the slots complex carries an urgency value, with the effect that jobs requesting more slots are treated as more important.
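As a rough illustration of that slots effect, the following Python sketch computes a per-job urgency contribution as slots requested times the urgency of the slots complex. The value 1000 is an assumption about a stock installation (check the urgency column of `qconf -sc` on your cluster), the job names are hypothetical, and the real scheduler normalizes and weights this value before combining it with tickets, which is not modeled here.

```python
def slot_urgency(slots_requested, urgency_per_slot=1000):
    """Raw urgency contribution from the slots request.
    urgency_per_slot=1000 is an assumed default; verify with `qconf -sc`."""
    return slots_requested * urgency_per_slot


# Hypothetical pending jobs: one serial job and one 8-slot parallel job.
jobs = {"serial": 1, "parallel_8": 8}
contrib = {name: slot_urgency(n) for name, n in jobs.items()}

# Jobs ranked by this contribution alone, highest first.
ranked = sorted(contrib, key=contrib.get, reverse=True)
print(contrib, ranked)
```

If all test jobs are serial, this contribution is identical for every job and should not change their relative order.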
-- Reuti