[gridengine users] Strange behavior with functional scheduling

Reuti reuti at staff.uni-marburg.de
Mon Oct 9 22:23:42 UTC 2017



On 10.10.2017 at 00:00, David Rosenstrauch wrote:

> On 2017-10-09 5:45 pm, Reuti wrote:
>> On 09.10.2017 at 23:01, David Rosenstrauch wrote:
>>> I'm a bit of an SGE noob, so please bear with me.  We're in the process of a first-time SGE deployment for the users in our department.  Although we've been able to use SGE, submit jobs to the queues successfully, etc., we're running into issues trying to get fair-share scheduling - specifically functional scheduling - to work correctly.
>>> We have very simple functional scheduling enabled, via the following configuration settings:
>>> enforce_user                 auto
>>> auto_user_fshare             100
>>> weight_tickets_functional    10000
>>> schedd_job_info              true
>>> (In addition, the "weight_tickets_share" setting is set to 0, thereby disabling share tree scheduling.)
>>> A colleague and I are testing this setup by both of us submitting multiple jobs to one of our queues simultaneously, with me first submitting a large number of jobs (100) and him submitting fewer (25) shortly afterwards.  Our understanding is that the functional scheduling policy should prevent one user from having their jobs completely dominate a queue.  And so our expectation is that even though my jobs were submitted first, and there are more of them, the scheduler should wind up giving his jobs a higher priority, so that he is not forced to wait until all of my jobs complete before his can run.  (If he did have to wait, that would effectively be FIFO scheduling, not fair share.)
>> The display of the pending tickets has to be enabled too to see the
>> effect (you should see them as being 0 right now in the pending list):
>> report_pjob_tickets               TRUE
>> In addition you can set:
>> policy_hierarchy                  F
>> -- Reuti
> 
> 
> Thanks for the feedback.
> 
> We do have report_pjob_tickets set to TRUE.  However, our policy_hierarchy is set to OFS.  Still, shouldn't that not be an issue if we have weight_tickets_share set to zero?  (I.e., if we're not using the override or share tree policies, then shouldn't this be effectively equivalent to "policy_hierarchy F"?)

Yes, but it can be streamlined.
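
For reference, a rough sketch of the streamlined setup (from memory, so double-check against the man pages; this assumes a stock installation where enforce_user and auto_user_fshare live in the global configuration and the weight/policy settings in the scheduler configuration):

    # scheduler configuration (edit with: qconf -msconf)
    policy_hierarchy          F
    weight_tickets_functional 10000
    weight_tickets_share      0
    report_pjob_tickets       TRUE

    # global configuration (edit with: qconf -mconf)
    enforce_user              auto
    auto_user_fshare          100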

Are you mixing parallel and serial jobs? By default the slots complex carries an urgency value, which has the effect that jobs requesting more slots are considered more important.
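
A quick way to check (assuming a stock installation, where the urgency of the slots complex defaults to 1000):

    # the last column of the "slots" line is its urgency,
    # which is what boosts jobs requesting many slots:
    qconf -sc | grep -E '^(#name|slots)'

    # show the urgency contribution for each pending job:
    qstat -urg

If you want to remove this bias, you can set the urgency of slots to 0 with qconf -mc.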

-- Reuti





