[gridengine users] Grid Engine accounting question

Jesse Becker beckerjes at mail.nih.gov
Thu May 9 01:41:38 UTC 2013


On Wed, May 08, 2013 at 02:02:04PM -0700, Brian McNally wrote:
>Thanks Reuti, you're awesome!
>
>I thought the halftime just dictated the length of time usage took 
>before it was half its original value. It seems to me that that is 
>not the same as how long the scheduler keeps usage information for 
>jobs. Although, at some point, say 4-5 halflife cycles, the decayed 
>usage is very small and doesn't have much of an impact.

I seem to recall hearing that "5 halflives" is how long radioactive
stuff has to decay before it's "safe."  Don't quote me on that though.
:)

Poking around a bit in the sgeee.c file (from SoGE version 8.1.1, which
is what I have handy ATM), it looks like, even though halftime is
specified in hours, the calculations are actually done in minutes.

It also looks like a real exponential decay is used, instead of a linear
decrease (as in some of the load calculations).  I think that the actual
decay rates come from the following (sge_support.c):

/*--------------------------------------------------------------------
  * calculate_decay_constant - calculates decay rate and constant based
  * on the decay half life and usage interval. The halftime argument
  * is in minutes.
  *--------------------------------------------------------------------*/

void
calculate_decay_constant( double halftime,
                           double *decay_rate,
                           double *decay_constant )
{
    if (halftime < 0) {
       *decay_rate = 1.0;
       *decay_constant = 0;
    } else if (halftime == 0) {
       *decay_rate = 0;
       *decay_constant = 1.0;
    } else {
       *decay_rate = - log(0.5) / (halftime * 60);
       *decay_constant = 1 - (*decay_rate * sge_usage_interval);
    }
    return;
}

This is especially interesting since it implies that negative halftimes
are acceptable.  Sure enough, setting a negative value zeros out
historical usage:
https://blogs.oracle.com/sgrell/entry/a_couple_lines_on_halftime


So yes, you'd need to keep your accounting files around for some number
of halftimes.  At 5 halflives, you're at 1/32nd of the original
weighting, or about 3%.




>--
>Brian McNally
>
>On 05/08/2013 01:46 PM, Reuti wrote:
>>Hi,
>>
>>Am 08.05.2013 um 22:30 schrieb Brian McNally:
>>
>>>qacct reports usage from a file, but GE has its own internal database for tracking jobs and usage.
>>
>>You mean for the share tree policy? Yes.
>>
>>
>>>Is this correct? If so, what controls the length of time GE keeps job data for?
>>
>>The "halftime" setting in the scheduler configuration (`man sched_conf`).
>>
>>
>>>It seems that using qacct to display overall usage per user (-o), for example, might be a little misleading if the actual accounting information is stored internally. Users might draw conclusions about their usage and how that'll impact their job priorities based on potentially incorrect data.
>>
>>Unfortunately this is correct. You can even remove the accounting file or rotate it, which might lead to even different output. It would be hard to mimic the internal computation. Maybe setting "report_pjob_tickets" to true could give them a hint at which position their jobs are in the pending list (usually it's switched off for performance reasons).
>>
>>-- Reuti
>>
>>
>>>
>>>Thanks,
>>>
>>>--
>>>Brian McNally
>>>_______________________________________________
>>>users mailing list
>>>users at gridengine.org
>>>https://gridengine.org/mailman/listinfo/users
>>

-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)


