[gridengine users] Resource quota question

Reuti reuti at staff.uni-marburg.de
Wed May 4 16:58:38 UTC 2011


Hi,

On 04.05.2011 at 18:09, Chris Jewell wrote:

> I'm currently having a problem in GE 6.2u5 with a resource quota configuration.  Essentially, I have 5 batch queues set up (veryshort, short, medium, long, unlimited) and 1 interactive queue.  I have written the following resource quota to ensure that any one user cannot consume more than 40 slots at any one time:
> 
> {
>   name         limit_slots_batch
>   description  "Limit users to 40 slots on batch queue"
>   enabled      TRUE
>   limit        users {*} queues !interactive.q to slots=40
> }
> 
> 
> However, I am getting a problem when scheduling parallel jobs where GE seems to think that the slot quota has been exceeded, even though there are plenty of free slots in the relevant queues.  Sometimes restarting sgemaster helps, sometimes it does not.  The output from qstat -j is as below, the offending line appears to be the penultimate one.
> 
> Does anyone have an idea what might be causing this, and how I could fix it?

Is the interactive queue interactive only? I mean, when the qtype is set to "INTERACTIVE" only, only interactive (better: immediate) jobs should end up there, so the exclusion of interactive.q could be removed from both the queue request and the resource quota.
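
To check this, one can look at the qtype line in the output of `qconf -sq interactive.q`. Below is a minimal sketch that pulls the qtype values out of such output. The helper name `queue_types` and the sample text are hypothetical; on a real cluster you would feed in the actual command output instead.

```python
def queue_types(qconf_sq_output):
    """Return the values on the qtype line of `qconf -sq <queue>` output."""
    for line in qconf_sq_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "qtype":
            return fields[1:]
    return []  # no qtype line found


# Hypothetical excerpt, not taken from the poster's cluster.
sample = """qname                 interactive.q
hostlist              @allhosts
qtype                 INTERACTIVE
"""

types = queue_types(sample)
# True only when INTERACTIVE is the sole entry on the qtype line.
interactive_only = types == ["INTERACTIVE"]
print(types, interactive_only)
```

If BATCH is listed as well, ordinary batch jobs can still land in the queue, and the exclusion in the request and the quota is still needed.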

So far I have only seen such strange behavior when some other consumable is requested (h_vmem in your case) in combination with a resource quota limit. Is h_vmem set up as a consumable?
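
Whether a complex is consumable shows up in the "consumable" column of `qconf -sc`. As a rough sketch, assuming the usual eight-column layout of that output, the setting could be read like this. The helper name `consumable_setting` and the sample table are made up for illustration; the h_vmem line in particular is an assumption, not the poster's actual configuration.

```python
def consumable_setting(qconf_sc_output, name):
    """Return the 'consumable' column for complex `name`, or None if absent."""
    for line in qconf_sc_output.splitlines():
        fields = line.split()
        # Skip blank lines and the commented header of `qconf -sc`.
        if not fields or fields[0].startswith("#"):
            continue
        # Assumed columns: name shortcut type relop requestable consumable default urgency
        if len(fields) >= 6 and fields[0] == name:
            return fields[5]
    return None


# Hypothetical excerpt of `qconf -sc` output.
sample = """#name    shortcut  type    relop requestable consumable default urgency
#----------------------------------------------------------------------
h_rt     h_rt      TIME    <=    YES         NO         0:0:0   0
h_vmem   h_vmem    MEMORY  <=    YES         YES        0       0
slots    s         INT     <=    YES         YES        1       1000
"""

print(consumable_setting(sample, "h_vmem"))
```

A value other than NO for h_vmem would match the combination described above.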

$ qquota -u "*"

doesn't return anything?

-- Reuti


> Thanks,
> 
> Chris
> 
> 
> ==============================================================
> job_number:                 83168
> exec_file:                  job_scripts/83168
> submission_time:            Wed May  4 15:57:22 2011
> owner:                      stsiab
> uid:                        1000
> group:                      st
> gid:                        1001
> sge_o_home:                 /home/stsiab
> sge_o_log_name:             stsiab
> sge_o_path:                 /usr/local/packages/python-2.7/bin:/usr/local/packages/R-2.11.1/bin:/usr/local/packages/eclipse-3.6:/usr/local/packages/valgrind-3.5.0/bin:/usr/local/packages/gdb-6.8/bin:/usr/local/packages/jdk1.6.0_21/bin:/usr/local/packages/openmpi-1.4.3/bin:/usr/local/packages/gcc-4.4.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/sge/bin/lx24-amd64:/usr/sge/bin/lx24-amd64
> sge_o_shell:                /bin/bash
> sge_o_workdir:              /home/stsiab/brandenburg/brandenburg/release/src/unitTests
> sge_o_host:                 buster
> account:                    sge
> cwd:                        /home/stsiab/brandenburg/brandenburg/release/src/unitTests
> hard resource_list:         h_vmem=500M,h_rt=86400
> mail_list:                  stsiab at buster.cluster.stats.local
> notify:                     FALSE
> job_name:                   auseiMcmcNC
> jobshare:                   0
> hard_queue_list:            !interactive.q
> shell_list:                 NONE:/bin/bash
> env_list:                   
> script_file:                auseiMcmc.com
> parallel environment:  mpi range: 16
> version:                    1
> scheduling info:            cannot run in queue "interactive.q" because it is not contained in its hard queue list (-q)
>                                         cannot run because it exceeds limit "stsiab/////" in rule "limit_slots_batch/1"
>                                         cannot run in PE "mpi" because it only offers 0 slots
> 
> 
> 
> --
> Dr Chris Jewell
> Department of Statistics
> University of Warwick
> Coventry
> CV4 7AL
> UK
> Tel: +44 (0)24 7615 0778
> 
> 
> 
> 
> 
> 




