[gridengine users] load_thresholds, load_scaling, and hyperthreading

Reuti reuti at staff.uni-marburg.de
Wed Nov 2 21:02:18 UTC 2016


On 02.11.2016 at 21:47, Joshua Baker-LePain wrote:

> On Wed, 2 Nov 2016 at 11:13am, Reuti wrote:
> 
>>> On 02.11.2016 at 18:36, Joshua Baker-LePain <jlb at salilab.org> wrote:
>>> 
>>> On our cluster, we have three queues per host, each with as many slots as the host has physical cores.  The queues are configured as follows:
>>> 
>>> o lab.q (high priority queue for cluster "owners")
>>>  - load_thresholds       np_load_avg=1.5
>>> o short.q (for jobs <30 minutes)
>>>  - load_thresholds       np_load_avg=1.25
>>> o long.q (low priority queue available to all users)
>>>  - load_thresholds       np_load_avg=0.9
>>> 
>>> The theory is that we want long.q to stop accepting jobs when a node is fully loaded (read: load = physical core count) and short.q to stop accepting jobs when a node is 50% overloaded.  This has worked well for a long while.
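>>> 
>>> To put numbers on that: np_load_avg is the load average divided by the processor count SGE detects (num_proc).  On a plain 8-core node, a full load of 8 running jobs gives np_load_avg = 8/8 = 1.0, which is over long.q's 0.9 threshold but under short.q's 1.25, so only long.q goes into alarm.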
>> 
>> As the load is just the number of runnable processes in the run queue*, it should certainly reach at least the number of available cores. Did you also increase the number of slots for these machines (and in the PEs)? What is `uptime` showing? What happens to the reported load when you run some jobs in the background, outside of SGE, on these nodes?
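>> 
>> A quick way to test, run directly on the node (these are just plain CPU burners, nothing SGE-specific):
>> 
>> for i in $(seq 8); do yes > /dev/null & done   # create 8 runnable processes
>> sleep 60                                       # let the 1-minute load average catch up
>> uptime                                         # load as the kernel reports it
>> qhost -h $(hostname)                           # load as SGE reports it
>> kill $(jobs -p)                                # clean up the burners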
> 
> I don't think I was entirely clear above.  We still consider a fully loaded node to be one using as many slots as there are *physical* cores, so each queue is defined to have as many slots as there are physical cores.  Our goals with the queues are as follows:
> 
> 1) If a node is running a full load of lab.q jobs, long.q should go into
>   alarm and not accept any jobs.
> 
> 2) That same fully loaded node should accept jobs in short.q until it is
>   50% overloaded, at which time short.q should also go into alarm.
> 
> 3) Conversely, if a node is running a full load of long.q jobs, it should
>   still accept a full load of lab.q jobs.
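> 
> For checking which threshold actually triggered an alarm on a given
> node, qstat can explain the 'a' state:
> 
> $ qstat -f -explain a -q long.q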
> 
> As an example, here's a non-hyperthreaded node:
> 
> $ qhost -q -h iq116
> iq116                   linux-x64       8  9.93   15.6G    4.0G    4.0G  196.3M
>   lab.q                BP    0/8/8
>   short.q              BP    0/2/8
>   long.q               BP    0/0/8         a
> 
> lab.q is full and short.q is still accepting jobs, but long.q is in alarm, as intended.  Here's a hyperthreaded node:
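> 
> The arithmetic works out as expected here: np_load_avg = 9.93/8 = about
> 1.24, which is over long.q's 0.9 but still (just) under short.q's 1.25.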
> 
> $ qhost -q -h msg-id1
> HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
> ----------------------------------------------------------------------------------------------
> global                  -               -    -    -    -     -       -       -       -       -
> msg-id1                 lx-amd64       48    2   24   48 24.52  251.6G    2.2G    4.0G     0.0
>   lab.q                BP    0/24/24
>   short.q              BP    0/0/24
>   long.q               BP    0/0/24
> 
> So even though lab.q is full, long.q isn't in alarm.  Here's how that node shows up in qconf:
> 
> $ qconf -se msg-id1
> hostname              msg-id1.ic.ucsf.edu
> load_scaling          np_load_avg=2.000000
> complex_values        mem_free=256000M
> load_values           arch=lx-amd64,num_proc=48,mem_total=257673.273438M, \
>                      swap_total=4095.996094M,virtual_total=261769.269531M, \
>                      m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT, \
>                      m_socket=2,m_core=24,m_thread=48,load_avg=24.520000, \
>                      load_short=24.490000,load_medium=24.520000, \
>                      load_long=24.500000,mem_free=255421.792969M, \
>                      swap_free=4095.996094M,virtual_free=259517.789062M, \
>                      mem_used=2251.480469M,swap_used=0.000000M, \
>                      virtual_used=2251.480469M,cpu=50.000000, \
>                      m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTT, \
>                      np_load_avg=0.510833,np_load_short=0.510208, \
>                      np_load_medium=0.510833,np_load_long=0.510417
> processors            48
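> 
> The np_load_avg values at the bottom show the problem: SGE divides the
> load by num_proc=48 (hardware threads), not by m_core=24, so a full
> physical load of 24 only yields 24.52/48 = 0.51, well under the 0.9
> threshold of long.q.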
> 
> Given that I have both hyperthreaded and non-hyperthreaded nodes, I can't just change the value of the queue's np_load_avg load_threshold.  I thought load_scaling was the answer, but it's not having any effect that I can see.

You can define the threshold per host or per hostgroup:

$ qconf -sq serial
...
load_thresholds       tmpfree=1G,np_load_avg=1.5,[@intel2670=np_load_avg=1.75],[node22=np_load_avg=2.0]
...
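
Applied to your setup, assuming you gather the hyperthreaded machines in
a hostgroup (hypothetically named @ht here, created with `qconf -ahgrp @ht`),
halving the thresholds there compensates for num_proc counting threads
instead of physical cores:

$ qconf -sq long.q
...
load_thresholds       np_load_avg=0.9,[@ht=np_load_avg=0.45]
...

and likewise 0.75 instead of 1.5 for lab.q, and 0.625 instead of 1.25
for short.q.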

-- Reuti


> 
> -- 
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF




