[gridengine users] "Packing" jobs on nodes

James Gladden gladden at chem.washington.edu
Thu Jun 2 20:33:47 UTC 2011


Dave,

Both of my clusters running SGE have a consumable complex named 
"slots".  Since I did not create it, I assumed it was a "predefined 
resource attribute."  Because both clusters were built with 
Rocks, I suppose it is possible that the "slots" complex was in fact 
created by helpful Rocks installation scripts.  What I did need to do 
manually was assign a value to "slots" for each execution host, so it is 
certainly true that there is no predefined relationship between slots 
and processors.
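
For what it's worth, assigning the value is just an edit to the host's 
complex_values list.  A minimal sketch, with "node01" standing in for a 
real hostname:

    # Set the consumable "slots" complex to 8 on one execution host.
    qconf -mattr exechost complex_values slots=8 node01

    # Equivalently, edit the host configuration interactively and set
    #   complex_values  slots=8
    qconf -me node01

A larger value (say slots=16 on an 8-core box) would presumably permit 
the over-subscription or hyper-threading use I mentioned earlier.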

Are you saying that the "slots" complex did not exist by default on your 
systems, or rather that it existed but no value was assigned on a 
per-host basis?

Jim

On 6/2/2011 9:59 AM, Dave Love wrote:
> James Gladden <gladden at chem.washington.edu> writes:
>
>> I have no such explicit resource quota configuration (as described below)
>> on my systems, yet no host slot over-subscription occurs.  I just tried
>> the experiment on an 8 slot node with two queues assigned.  When I
>> submit jobs to the specific queue instances associated with that
>> execution host, the first 8 jobs I submit get dispatched to the node
>> while the remainder (from either of the two queues) wait in the "qw" state.
>>
>> The only thing I have done to facilitate this behavior is to set the
>> value of the consumable "slots" resource for each execution host to 8
>> (which happens to be the number of cores on each execution host).
>> Presumably if I had wanted to allow over-subscription, or utilize
>> hyper-threading, I could set the value to something larger.
>>
>> My conclusion is that enforcing the "slots" resource limit on hosts is
>> the default behavior for SGE.  Has anyone actually observed different
>> behavior?
> Yes, with overlapping, non-mutually-suspended queue definitions.
>
> I don't understand `default behaviour' as there's no default slots host
> resource, and no necessary relationship between slots and processors.
> On the other hand, the rqs for num_proc exactly expresses what I
> understand by not over-subscribing hosts.
>
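
(For reference, I take the num_proc RQS Dave mentions to be something 
along these lines; the quota's name and description here are placeholders:

    {
       name         limit_slots_to_cores
       description  Keep slots per host within the processor count
       enabled      TRUE
       limit        hosts {*} to slots=$num_proc
    }

added with "qconf -arqs" or adjusted with "qconf -mrqs".)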


