[gridengine users] job cannot run in parallel environment "smp" because it only offers 2 slots

Bob Tupper bobctupper at gmail.com
Tue Feb 21 19:28:15 UTC 2012


You could change the h_vmem consumable setting from YES to JOB. With
JOB, the requested amount is debited once per job rather than once
per slot.
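
To make the arithmetic concrete, here is a rough sketch of the two accounting modes. It is a simplified model, not SGE code: the function names are made up for illustration, the 25 Gb of free memory is a hypothetical figure, and it assumes your installed version accepts JOB as a consumable value.

```python
def reserved_memory_gb(h_vmem_gb, slots, consumable="YES"):
    """Memory debited from the consumable for one job (illustrative model)."""
    if consumable == "YES":
        # Per-slot accounting: the request is multiplied by the slot count.
        return h_vmem_gb * slots
    if consumable == "JOB":
        # Per-job accounting: the request is debited once, whatever the slot count.
        return h_vmem_gb
    raise ValueError("consumable must be 'YES' or 'JOB'")


def max_slots_offered(free_gb, h_vmem_gb):
    """Slots a node can offer under per-slot (YES) accounting."""
    return int(free_gb // h_vmem_gb)


# The 8-thread, 10 Gb job from the original message:
print(reserved_memory_gb(10, 8, "YES"))  # 80
print(reserved_memory_gb(10, 8, "JOB"))  # 10

# Hypothetical node with 25 Gb of h_vmem still free:
print(max_slots_offered(25, 10))  # 2
```

Under YES, the last line shows how an "only offers 2 slots" message can come about: the slot count offered is simply the free h_vmem divided by the per-slot request.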




On 02/21/2012 11:20 AM, Txema Heredia Genestar wrote:
> Hello all,
>
> I am having some problems running threaded jobs in SGE 6.1u4. In our
> cluster, h_vmem is defined as a consumable attribute on all nodes. It
> is mandatory: all jobs must request it, and the default value is 6Gb.
> That constraint leads any "parallel" job sent to the cluster to try to
> reserve a lot of memory (h_vmem * slots). This is OK for most parallel
> processes (MPI and the like). But sometimes we need to run
> "threaded" jobs, where all threads share a chunk of memory (everything
> on a single node). This leads to situations where I need to submit an
> 8-threaded job that requires, say, 10Gb of memory, but it cannot be
> scheduled because no node can handle an 80Gb request. When a memory
> request cannot be fulfilled, the typical message 'cannot run in PE
> "smp" because it only offers N slots' appears in qstat (where N is the
> maximum number of slots I would be able to use given the requested
> h_vmem size).
>
> This is the parallel environment I am trying to use:
>
> # qconf -sp smp
> pe_name           smp
> slots             9999
> user_lists        test_users
> xuser_lists       NONE
> start_proc_args   /bin/true
> stop_proc_args    /bin/true
> allocation_rule   $fill_up
> control_slaves    FALSE
> job_is_first_task FALSE
> urgency_slots     min
>
> The most annoying part of all this is that this behaviour is not
> consistent: this morning I was able to run a 6-threaded job
> requesting 10Gb of memory on a 48Gb node. But in the afternoon, the
> same job, submitted with the very same command to the same node, could
> not be run.
>
> Does anyone have any suggestion on how to deal with this?
>
> Thanks in advance,
>
> Txema