[gridengine users] h_vmem negative values?

Alex Chekholko chekh at stanford.edu
Fri Oct 19 18:58:03 UTC 2012


qhost values seem fine:

...
scg3-0-11               lx26-amd64     32 27.15   63.0G   38.3G    9.8G  393.6M
scg3-0-12               lx26-amd64     32 27.36   63.0G   38.7G    9.8G   33.6M
scg3-0-13               lx26-amd64     32 22.61   63.0G   24.4G    9.8G   31.5M
...

When I submit a job as myself with such a memory request, it doesn't get 
dispatched, just sits in 'qw'.

Regards,
Alex

On 10/18/12 7:41 PM, Rayson Ho wrote:
> Alex,
>
> Can you run qhost and see if the memory value is also negative?
> If it is, then this bug was fixed in any release of OGS/GE.
>
> Rayson
>
>
>
> On Thu, Oct 18, 2012 at 6:53 PM, Alex Chekholko <chekh at stanford.edu> wrote:
>> Hi,
>>
>> Running Rocks 6, so whatever GE version is included there.
>>
>> h_vmem is set consumable and per job, 4G default:
>>
>> -bash-4.1$ qconf -sc |grep h_vmem
>> h_vmem              h_vmem     MEMORY      <=    YES         JOB 4G       0
>>
>> each exec host has an h_vmem attribute set:
>> -bash-4.1$ qconf -se scg3-0-11 |grep h_vmem
>> complex_values        slots=16,h_vmem=60G
>>
>> pe "shm" is defined:
>> -bash-4.1$ qconf -sp shm
>> pe_name            shm
>> slots              999
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    NONE
>> stop_proc_args     NONE
>> allocation_rule    $pe_slots
>> control_slaves     FALSE
>> job_is_first_task  TRUE
>> urgency_slots      min
>> accounting_summary FALSE
>>
>> A user is submitting a job with '-pe shm -l h_vmem=120G', and it's getting
>> dispatched to a host that has h_vmem=60G defined.  How is that possible?
>>
>> And qstat reports negative h_vmem values, e.g.:
>> -bash-4.1$ qstat -f -u '*' -F h_vmem
>> ...
>> all.q@scg3-0-11.local          BIP   0/16/16        12.12    lx26-amd64
>>          hc:h_vmem=-80.000G
>>    88866 0.50500 mCSRR57762 yxl          r     10/18/2012 09:17:21     1
>>    89094 0.60500 G_ordermar elisaz       r     10/18/2012 15:03:39    15
>> ...
>>
>> Maybe the sgeexecd needs to be cycled for the setting to take effect?  I can
>> try that next.
>>
>> Regards,
>> --
>> Alex Chekholko chekh at stanford.edu
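
To check whether other hosts are oversubscribed in the same way, the qstat output quoted above can be filtered for negative h_vmem values. This is only a minimal sketch: the function name find_negative_h_vmem is made up for illustration, and it assumes that each queue instance line starts in the first column and is printed as queue@host, followed by an indented hc:h_vmem=... line as in the excerpt.

```shell
# Hypothetical helper (not part of Grid Engine): list queue instances whose
# remaining h_vmem consumable is negative.
# Reads `qstat -f -u '*' -F h_vmem` style output on stdin.
find_negative_h_vmem() {
  awk '
    # Queue instance lines start in column 1 and look like queue@host.
    /^[^ \t]/ && $1 ~ /@/ { qi = $1 }
    # Resource lines look like "hc:h_vmem=-80.000G"; keep only negative ones.
    /hc:h_vmem=-/ {
      sub(/^[ \t]*hc:h_vmem=/, "")
      print qi, $0
    }
  '
}

# Intended use on a live cluster (assumes qstat is on PATH):
#   qstat -f -u '*' -F h_vmem | find_negative_h_vmem
```

Each output line is a queue instance followed by its remaining h_vmem, so any host listed here has had more memory booked against it than its complex_values entry allows.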


