[gridengine users] Formula to calculate job's priority

Sangamesh Banappa sangamesh.banappa at locuz.com
Wed May 8 09:32:58 UTC 2013


Hi,

    I want to know how the shares get reduced on subsequent job submissions. The relevant output is pasted in the last part of this mail:
----- Original Message -----
> Hi,
> 
> On 02.05.2013 at 11:28, Sangamesh Banappa wrote:
> 
> > ----- Original Message -----
> >> Hi,
> >> 
> >> On 01.05.2013 at 16:51, Sangamesh Banappa wrote:
> >> 
> >>>        The cluster is configured with a share-based policy (user
> >>>        based), with equal shares (100 shares) for all users.
> >>> 
> >>>        As per the below link:
> >>> 
> >>>         http://arc.liv.ac.uk/SGE/htmlman/htmlman5/sge_priority.html
> >>> 
> >>>        The job's priority is calculated as follows:
> >>> 
> >>>          prio    = weight_priority * npprio +
> >>>                    weight_urgency  * nurg +
> >>>                    weight_ticket   * ntckts
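> >>> 
> >>>      As a minimal sketch of how I read this combination (the weight
> >>>      values below are only placeholders; the real ones come from
> >>>      weight_priority, weight_urgency and weight_ticket shown by
> >>>      `qconf -ssconf`):
> >>> 
> >>>          # Sketch only: combine the three normalized inputs with the
> >>>          # configured policy weights (placeholder values here).
> >>>          weight_priority = 1.0
> >>>          weight_urgency  = 0.1
> >>>          weight_ticket   = 0.01
> >>> 
> >>>          npprio = 0.0      # placeholder normalized POSIX priority
> >>>          nurg   = 0.66667  # placeholder normalized urgency
> >>>          ntckts = 1.0      # placeholder normalized tickets
> >>> 
> >>>          prio = (weight_priority * npprio +
> >>>                  weight_urgency  * nurg +
> >>>                  weight_ticket   * ntckts)
> >>>          print(prio)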
> >>> 
> >>>      Let's take an example to analyze this: a parallel job of 20
> >>>      cores. There are no other requests like #$ -l in the script.
> >>> 
> >>>      The value of npprio would be zero, because there is no #$ -p
> >>>      <posix priority value> mentioned in the job script, and the
> >>>      admin does not set a priority for the job manually either.
> >>> 
> >>>       Further,
> >>> 
> >>>       nurg =  normalized urgency
> >>> 
> >>>       urg = rrcontr + wtcontr + dlcontr
> >>> 
> >>>       There is no user configured in the deadline users group, so
> >>>       dlcontr should be zero, correct?
> >>> 
> >>>       rrcontr = sum of all (hrr)
> >>> 
> >>>       hrr = rurg  * assumed_slot_allocation * request
> >>> 
> >>>       rurg -> taken from qconf -sc | grep slots; 1000 is the value
> >>>       in the urgency column.
> >>> 
> >>>       assumed_slot_allocation = 20 (taken from #$ -pe orte 20)
> >> 
> >> This is of course an implied resource request for slots.
> >> 
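> >> A rough sketch of that reading (an assumption on my side: the implicit
> >> slots request contributes once per granted slot, using the urgency
> >> value of the "slots" complex):
> >> 
> >>     # Sketch only: resource-request contribution of a parallel job,
> >>     # assuming the implicit "slots" request is what drives rrcontr.
> >>     slots_urgency           = 1000   # urgency column of "slots" in `qconf -sc`
> >>     assumed_slot_allocation = 4      # e.g. from "#$ -pe orte 4"
> >> 
> >>     rrcontr = slots_urgency * assumed_slot_allocation
> >>     print(rrcontr)                   # 4000, like the rrcontr column further down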
> >> 
> >>>       request = NONE, hence it is 0 (there is no other resource
> >>>       request via #$ -l).
> >>> 
> >>>       Next is "wtcontr". How is the waiting_time contribution
> >>>       calculated? Is it based on this job only, or is it a sum of
> >>>       the waiting times of all previous jobs by the same user?
> >>> 
> >>>       I'm stuck here. Please guide me through the rest of the calculation.
> >> 
> >> One way to check this behavior could be to set "weight_waiting_time
> >> 1.00000" together with "report_pjob_tickets TRUE" in the scheduler
> >> configuration and have a look at `qstat -urg`. Also worth noting
> >> are `qstat -pri` and `qstat -ext` for the overall computation.
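> >> 
> >> As a rough sketch of what I would expect to see for wtcontr with
> >> "weight_waiting_time 1.00000" (my reading of sge_priority(5), so treat
> >> the exact formula as an assumption):
> >> 
> >>     import time
> >> 
> >>     # Sketch only: waiting-time contribution, roughly one urgency point
> >>     # per second a job has been pending when weight_waiting_time is 1.0.
> >>     weight_waiting_time = 1.0
> >>     submit_time = time.time() - 98       # pretend the job was submitted 98 s ago
> >> 
> >>     waiting_seconds = time.time() - submit_time
> >>     wtcontr = weight_waiting_time * waiting_seconds
> >>     print(round(wtcontr))                # ~98, in the range of the wtcontr values below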
> >> 
> > Both of the above-mentioned settings are set.
> > 
> > This is the output (1 node with 4 cores; the same 4-core job is
> > submitted multiple times):
> > 
> > 
> > # qstat -u "*" -urg
> > job-ID  prior   nurg    urg      rrcontr  wtcontr  dlcontr  name       user         state submit/start at      deadline            queue                          slots ja-task-ID
> > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >     47 0.61000 0.66667     4098     4000       98        0 mpi        user1        r     05/02/2013 13:12:36                       all.q@TEST.example.com             4
> >     48 0.60667 1.00000     4113     4000      113        0 mpi        user1        qw    05/02/2013 13:10:43                                                          4
> >     49 0.55500 0.50000     4112     4000      112        0 mpi        user1        qw    05/02/2013 13:10:44                                                          4
> 
> This is of course strange, as I don't see an entry with nurg 0.00000.
> In my `qstat` I can see:
> 
>    8393 1.50167 1.00000     4063     4000       63        0 test.sh    reuti        qw    05/02/2013 12:53:58                                                          4
>    8409 1.00026 0.50000     4062     4000       62        0 test.sh    reuti        qw    05/02/2013 12:53:59                                                          4
>    8422 0.50016 0.00000     4061     4000       61        0 test.sh    reuti        qw    05/02/2013 12:54:00                                                          4
> 
> like it should be. The lowest overall urgency is normalized to zero,
> and the highest to one.
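> 
> A minimal sketch of that normalization as I understand it (a plain
> min/max scaling over all pending jobs; the numbers are the urg values
> from my output above):
> 
>     # Sketch only: normalize the overall urgency of all jobs to [0, 1].
>     urgencies = [4063, 4062, 4061]     # urg column of the three jobs above
> 
>     lo, hi = min(urgencies), max(urgencies)
>     nurg = [(u - lo) / (hi - lo) for u in urgencies]
>     print(nurg)                        # [1.0, 0.5, 0.0], matching the nurg column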
> 
> -- Reuti
> 
> 
> >     Here I do understand the values of "urg", but I don't get how
> >     "nurg" is calculated.
> > 
> > # qstat -u "*" -ext
> > job-ID  prior   ntckts  name       user         project          department state cpu        mem     io      tckts ovrts otckt ftckt stckt share queue                          slots ja-task-ID
> > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> >     47 0.61000 1.00000 mpi        user1        NA               defaultdep r     0:00:00:00 0.00371 0.00078   100     0     0     0   100 1.00  all.q@TEST.example.com             4
> >     48 0.60500 0.50000 mpi        user1        NA               defaultdep qw                                  50     0     0     0    50 0.34                                     4
> >     49 0.55333 0.33333 mpi        user1        NA               defaultdep qw                                  33     0     0     0    33 0.23                                     4
> >     50 0.55250 0.25000 mpi        user1        NA               defaultdep qw                                  25     0     0     0    25 0.17                                     4
> >     51 0.55200 0.20000 mpi        user1        NA               defaultdep qw                                  20     0     0     0    20 0.14                                     4
> >     52 0.50167 0.16667 mpi        user1        NA               defaultdep qw                                  16     0     0     0    16 0.11                                     4
> > 
> > In this output, why does the value of tckts change: 100, 50, 33, ...?
> > The sharetree has 100 shares.

Can someone explain this: why are the share values getting reduced?

The halftime is set to 168 hours, which is 7 days. How is the past usage taken into account here?
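
Looking at the numbers above, it seems to me as if the user's 100 tickets are
simply spread over his pending jobs, and as if past usage decays with the
configured halftime. A minimal sketch of both readings (just my assumptions,
please correct me if they are wrong):

    # Sketch 1 (assumption): the user's 100 tickets are spread over the
    # pending jobs, the n-th pending job getting roughly 100/n tickets.
    user_tickets = 100
    for n in range(1, 7):
        print(n, int(user_tickets / n))   # 100, 50, 33, 25, 20, 16 as in `qstat -ext`

    # Sketch 2 (assumption): past usage is halved every "halftime" hours.
    halftime_hours = 168                  # 7 days
    def decayed_usage(usage, hours_since_use):
        return usage * 0.5 ** (hours_since_use / halftime_hours)

    print(decayed_usage(1000.0, 168))     # 500.0 -> half the usage remains after 7 days
    print(decayed_usage(1000.0, 336))     # 250.0 -> a quarter after 14 days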

> >> -- Reuti
> >> 
> >> 
> >>> 
> >>> 
> >>> Thanks in advance
> >>> _______________________________________________
> >>> users mailing list
> >>> users at gridengine.org
> >>> https://gridengine.org/mailman/listinfo/users
> >> 
> >> 
> 
> 


