[gridengine users] Strange behavior with functional scheduling

Reuti reuti at staff.uni-marburg.de
Mon Oct 9 21:45:10 UTC 2017


On 09.10.2017 at 23:01, David Rosenstrauch wrote:

> I'm a bit of an SGE noob, so please bear with me.  We're in the process of a first-time SGE deployment for the users in our department.  Although we've been able to use SGE, submit jobs to the queues successfully, etc., we're running into issues trying to get the fair-share scheduling - specifically the functional scheduling - to work correctly.
> 
> We have very simple functional scheduling enabled, via the following configuration settings:
> 
> enforce_user                      auto
> auto_user_fshare                  100
> weight_tickets_functional         10000
> schedd_job_info                   true
> 
> (In addition, the "weight_tickets_share" setting is set to 0, thereby disabling share tree scheduling.)
> 
> A colleague and I are testing this setup by both submitting multiple jobs to one of our queues simultaneously, with me first submitting a large number of jobs (100) and him submitting a smaller number (25) shortly afterwards.  Our understanding is that the functional scheduling policy should prevent one user from having their jobs completely dominate a queue.  And so our expectation is that even though my jobs were submitted first, and there are more of them, the scheduler should wind up giving his jobs a higher priority so that he is not forced to wait until all of my jobs complete before his run.  (If he did have to wait, that would effectively be FIFO scheduling, not fair share.)

The display of the pending tickets has to be enabled too in order to see the effect (you should see them as being 0 right now in the pending list):

report_pjob_tickets               TRUE

In addition, you can set:

policy_hierarchy                  F
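
As a rough illustration of what the pending ticket counts might look like once they are displayed, here is a small Python sketch. It is a simplified model and not SGE's actual ticket algorithm: it assumes the functional tickets are split across users in proportion to their fshare and then evenly across each user's pending jobs. The function name and the user names are hypothetical; the numbers are the ones from the quoted configuration and test.

```python
# Simplified model (an assumption, not SGE's real ticket algorithm):
# functional tickets are split across users in proportion to fshare,
# then evenly across each user's pending jobs.

def split_functional_tickets(total_tickets, fshares, pending_jobs):
    """Return {user: (tickets_for_user, tickets_per_pending_job)}."""
    total_share = sum(fshares.values())
    result = {}
    for user, share in fshares.items():
        user_tickets = total_tickets * share / total_share
        jobs = pending_jobs.get(user, 0)
        per_job = user_tickets / jobs if jobs else 0.0
        result[user] = (user_tickets, per_job)
    return result

tickets = split_functional_tickets(
    total_tickets=10000,                      # weight_tickets_functional
    fshares={"dr": 100, "colleague": 100},    # auto_user_fshare (user names are hypothetical)
    pending_jobs={"dr": 100, "colleague": 25},
)
for user, (user_total, per_job) in tickets.items():
    print(f"{user}: {user_total:.0f} tickets, {per_job:.0f} per pending job")
```

Under these assumptions both users hold the same total number of tickets, so each job in the smaller batch carries more tickets than a job in the larger batch. The real values depend on further scheduler settings and should be read from the pending list.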

-- Reuti


> Although we aren't seeing FIFO scheduling, we're seeing close to it.  One of his jobs (eventually) gets assigned a high number of tickets, and a higher priority, and gets scheduled and run.  But the remaining several dozen sit in the queue and don't get run until all of mine complete - which is not really fair share.
> 
> Although it does look like functional scheduling is happening to some extent (at least one of his jobs is getting prioritized ahead of mine) this scheduling behavior is not what we were expecting to see.  Our expectation was that one of his jobs would run for every 4 of mine (more or less), and that his jobs would not wind up queued up to run after mine complete.
> 
> 
> Any idea what might be going on here?  Do I have my system misconfigured for functional scheduling?  Or am I just misunderstanding how this is supposed to work?  I've already done quite a bit of googling and man page reading on the relevant topics and settings, but wasn't able to find a good explanation for the behavior we're seeing.  Any help greatly appreciated!
> 
> Thanks,
> 
> DR
> _______________________________________________
> users mailing list
> users at gridengine.org
> https://gridengine.org/mailman/listinfo/users




