[gridengine users] Functional share policy question

Jesse Becker beckerjes at mail.nih.gov
Tue Nov 27 22:30:35 UTC 2012


On Tue, Nov 27, 2012 at 05:15:24PM -0500, Allan Tran wrote:
>On Tue, Nov 27, 2012 at 2:43 PM, Jesse Becker <beckerjes at mail.nih.gov> wrote:
>I was thinking of enabling the functional share policy, and actually set it up following these instructions (http://docs.oracle.com/cd/E19080-01/n1.grid.eng6/817-5677/i999885/index.html)
>However, I'm not quite clear how the number of functional tickets translates to SGE slots. Will jobs be suspended or resumed by default with this setup? Or does it even do what I'm after here?
>Thanks for your response and advice.
>
>Functional shares (alone) won't suspend any jobs.  They are used for
>scheduling jobs, to try to balance job distribution as best they can
>according to the ticket policy you've set.
>
>With 120 total slots in a single queue, and assuming sufficient jobs from
>each "group," SGE will try to allocate 24 SLOTS to math, 60 SLOTS to chem,
>12 to CS, 12 to Bio, and 12 to "other."  Note that I said "slots,"
>not "nodes."  Unless there's a good reason to not "mix" jobs from
>different groups on the same node, don't try to segregate things.
>
>So the number of tickets (10000) doesn't necessarily mean much? I could use 100 and give out 50, 20, 10, 10 and 10 to each group accordingly and they would still get the above number of slots?

More tickets give you finer granularity for managing your policies.
While I haven't really thought about it much, I don't see the point of
having more than about 5x as many tickets as you have queue slots.
It certainly doesn't *hurt*, but you don't gain anything either (that I
can see).

You are correct in that you could use 100 total tickets, given out as
you mentioned.  That's the same as using 10,000 tickets, and allocating
5000 to chem, 2000 to math, etc.  The important thing is the relative
number of tickets:  "50/20" is the same (so far as scheduling goes) as
"500/200" or "5/2."

However, functional shares are only part of a larger scheduling policy
that can take into account other factors, including how long a job has
been queued, what user/group submitted the job, what resources are
requested, etc.
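
Setting those other factors aside, the ticket arithmetic described above
can be sketched in a few lines of Python.  This is only an illustration
of the proportional split, not SGE's actual scheduler logic; the function
name target_slots and the group labels are made up for the example, and
it assumes every group has enough pending jobs to fill its share.

```python
from fractions import Fraction

def target_slots(tickets, total_slots):
    """Split total_slots in proportion to each group's functional tickets.

    Idealized steady-state target only; the real scheduler also weighs
    other policies and only acts when jobs are dispatched.
    """
    total = sum(tickets.values())
    return {group: Fraction(t, total) * total_slots
            for group, t in tickets.items()}

# Hypothetical ticket assignments: 100 total vs. 10,000 total tickets,
# with the same relative proportions.
small = {"chem": 50, "math": 20, "cs": 10, "bio": 10, "other": 10}
large = {group: t * 100 for group, t in small.items()}

for group, slots in target_slots(small, 120).items():
    print(f"{group}: {slots} slots")
```

With 120 slots, both ticket assignments give the same targets: 60 for
chem, 24 for math, and 12 each for the remaining three groups, because
only the ratios between the ticket counts matter.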

>Yes, I have one queue and I like the idea of slots...to make life simple. Any group can get its allocated slots on whichever nodes are available at any time. They might get 12 slots on 2 or 3 different nodes.
>
>
>Functional shares also won't inherently suspend any jobs; it deals with
>scheduling and dispatch.  You can suspend jobs via other means though,
>including load threshold and subordinate queues.
>
>Thanks, I will read more on this. At this point I just want to get the share policy working.
>
>
>Incidentally, the "share tree" works basically the same way as
>functional shares, except that it takes past usage into account.
>Functional shares *only* look at current state of the queues *right
>now*.  This may, or may not be appropriate for your circumstance.
>
>I don't think the groups would care about past usage. They just want to get their share at any given time. But you pointed out that jobs won't be suspended...so when an unused portion of one group's share gets used by another group, will jobs from the group holding fewer slots than its allocation be next in line to run?

It can be useful, even if the users don't think they want it.  Share
tree usage can be more fair over time, which might help avoid political
problems with one group being perceived as "using too much."


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)

