[gridengine users] h_vmem negative values?

Alex Chekholko chekh at stanford.edu
Wed Nov 28 21:50:44 UTC 2012


Hi all,

I am still seeing negative h_vmem values in 'qhost -F h_vmem' while running the latest OGS:

scg3-0-2                linux-x64      32 14.72   63.0G    5.7G    9.8G   26.4M
     Host Resource(s):      hc:h_vmem=12.375G
scg3-0-20               linux-x64      32 16.12   63.0G    3.5G    9.8G   23.5M
     Host Resource(s):      hc:h_vmem=-23.906G
scg3-0-21               linux-x64      32 13.95   63.0G    8.8G    9.8G   19.5M
     Host Resource(s):      hc:h_vmem=-15.906G
scg3-0-22               linux-x64      32 13.21   63.0G    5.0G    9.8G   24.1M
     Host Resource(s):      hc:h_vmem=-15.906G
scg3-0-23               linux-x64      32 12.81   63.0G    8.4G    9.8G   27.8M
     Host Resource(s):      hc:h_vmem=1.000G

Is there anything I can do to diagnose this issue?
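
For anyone who wants to spot the affected hosts quickly, here is a minimal Python sketch that scans 'qhost -F h_vmem' output and lists the hosts whose hc:h_vmem is negative. The function name negative_h_vmem is made up for illustration, and the parsing assumes the layout shown above (an unindented host line followed by an indented 'Host Resource(s):' line). It only detects the symptom; it says nothing about the cause.

```python
import re

# Matches e.g. "hc:h_vmem=-23.906G"; the unit suffix is optional.
HVMEM = re.compile(r"hc:h_vmem=(-?\d+(?:\.\d+)?)([kKmMgG]?)")

# Grid Engine memory suffixes: uppercase is binary, lowercase is decimal.
SCALE = {
    "": 1,
    "k": 1000, "K": 1024,
    "m": 1000**2, "M": 1024**2,
    "g": 1000**3, "G": 1024**3,
}


def negative_h_vmem(qhost_output):
    """Return {hostname: bytes} for hosts whose consumable h_vmem is below zero.

    Assumption: each host row starts in column 0 and its resource lines
    are indented, as in the listing quoted in this thread.
    """
    flagged = {}
    host = None
    for line in qhost_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line: a new host row (or a header/separator line).
            host = line.split()[0]
            continue
        match = HVMEM.search(line)
        if match and host is not None:
            value = float(match.group(1)) * SCALE[match.group(2)]
            if value < 0:
                flagged[host] = value
    return flagged
```

It could be fed from the live command with something like negative_h_vmem(subprocess.check_output(["qhost", "-F", "h_vmem"], text=True)); on the listing above it would flag scg3-0-20, scg3-0-21 and scg3-0-22.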

On 10/31/12 3:19 PM, Dave Love wrote:
> Alex Chekholko <chekh at stanford.edu> writes:
>
>> Hi Reuti,
>>
>> Thanks for your response, here's the output of 'qhost -F h_vmem'.
>> I am not sure how to interpret the negative values here either.
>
> You can get over-subscription of hosts from contributions to parallel
> job resources from multiple queues, but there's also at least one bug
> producing such symptoms.  If I recall correctly, Reuti has some
> diagnosis in the issue tracker, to do with multiple resource requests.
> An RQS with a dynamic limit may work around it.
>

-- 
Alex Chekholko chekh at stanford.edu 347-401-4860


