[gridengine users] load_thresholds, load_scaling, and hyperthreading

Joshua Baker-LePain jlb at salilab.org
Wed Nov 2 20:47:58 UTC 2016


On Wed, 2 Nov 2016 at 11:13am, Reuti wrote

>> On 02.11.2016 at 18:36, Joshua Baker-LePain <jlb at salilab.org> wrote:
>>
>> On our cluster, we have three queues per host, each with as many slots 
>> as the host has physical cores.  The queues are configured as follows:
>>
>> o lab.q (high priority queue for cluster "owners")
>>   - load_thresholds       np_load_avg=1.5
>> o short.q (for jobs <30 minutes)
>>   - load_thresholds       np_load_avg=1.25
>> o long.q (low priority queue available to all users)
>>   - load_thresholds       np_load_avg=0.9
>>
>> The theory is that we want long.q to stop accepting jobs when a node is 
>> fully loaded (read: load = physical core count) and short.q to stop 
>> accepting jobs when a node is 50% overloaded.  This has worked
>> well for a long while.
>
> As the load is just the number of eligible processes in the run queue*, 
> it should for sure get at least up to the number of available cores. Did 
> you increase the number of slots for these machines too (also PEs)? What 
> is `uptime` showing? What happens with the reported load, when you run 
> some jobs in the background outside of SGE on these nodes?

I don't think I was entirely clear above.  We still consider a fully 
loaded node to be one using as many slots as there are *physical* cores. 
So each queue is defined to have as many slots as there are physical 
cores.  Our goals for the queues are these:

1) If a node is running a full load of lab.q jobs, long.q should go into
    alarm and not accept any jobs.

2) That same fully loaded node should accept jobs in short.q until it is
    50% overloaded, at which time short.q should also go into alarm.

3) Conversely, if a node is running a full load of long.q jobs, it should
    still accept a full load of lab.q jobs.
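
To put rough numbers on those thresholds (assuming, as I understand it, that
SGE derives np_load_avg as load_avg / num_proc, with num_proc counting
hardware threads):

$ echo '8 / 8' | bc -l      # 8 jobs on an 8-core, non-HT node: np_load_avg = 1.00
$ echo '24 / 48' | bc -l    # 24 jobs on a 24-core/48-thread node: np_load_avg = 0.50

So a "full" non-hyperthreaded node crosses long.q's 0.9 threshold but stays
under short.q's 1.25, while a "full" hyperthreaded node never gets anywhere
near 0.9.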

As an example, here's a non-hyperthreaded node:

$ qhost -q -h iq116
iq116                   linux-x64       8  9.93   15.6G    4.0G    4.0G  196.3M
    lab.q                BP    0/8/8
    short.q              BP    0/2/8
    long.q               BP    0/0/8         a

lab.q is full and short.q is still accepting jobs, but long.q is in alarm, 
as intended.  Here's a hyperthreaded node:

$ qhost -q -h msg-id1
HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
----------------------------------------------------------------------------------------------
global                  -               -    -    -    -     -       -       -       -       -
msg-id1                 lx-amd64       48    2   24   48 24.52  251.6G    2.2G    4.0G     0.0
    lab.q                BP    0/24/24
    short.q              BP    0/0/24
    long.q               BP    0/0/24

So even though lab.q is full, long.q isn't in alarm.  Here's how that node 
shows up in qconf:

$ qconf -se msg-id1
hostname              msg-id1.ic.ucsf.edu
load_scaling          np_load_avg=2.000000
complex_values        mem_free=256000M
load_values           arch=lx-amd64,num_proc=48,mem_total=257673.273438M, \
                       swap_total=4095.996094M,virtual_total=261769.269531M, \
                       m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT, \
                       m_socket=2,m_core=24,m_thread=48,load_avg=24.520000, \
                       load_short=24.490000,load_medium=24.520000, \
                       load_long=24.500000,mem_free=255421.792969M, \
                       swap_free=4095.996094M,virtual_free=259517.789062M, \
                       mem_used=2251.480469M,swap_used=0.000000M, \
                       virtual_used=2251.480469M,cpu=50.000000, \
                       m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT, \
                       np_load_avg=0.510833,np_load_short=0.510208, \
                       np_load_medium=0.510833,np_load_long=0.510417
processors            48
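
If load_scaling works the way I hoped (i.e. the factor is applied to the
reported value before the scheduler compares it against load_thresholds),
then on this host the 2.0 multiplier should turn the ~0.51 above into
something long.q would alarm on:

$ echo '2.0 * 24.52 / 48' | bc -l    # ~1.02, over long.q's 0.9 but still under short.q's 1.25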

Given that I have both hyperthreaded and non-hyperthreaded nodes, I can't
just change the value of the queues' np_load_avg load_thresholds.  I thought
load_scaling was the answer, but it's not having any effect that I can see.
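
I'm also not sure where to look for the scaled value.  I would have guessed
that something like

$ qstat -F np_load_avg -q long.q@msg-id1

would show it per queue instance, but I don't know whether that reports the
raw or the scaled figure.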

-- 
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF


