[gridengine users] SGE PE scheduler problem, doesn't pick least used nodes ?
Reuti
reuti at staff.uni-marburg.de
Wed Mar 16 12:24:43 UTC 2011
On 16.03.2011 at 13:18, Erik Soyez wrote:
> Well, that's probably true, "exclusive" resources are not the best choice.
> But the concept could still work if you defined that resource as an ordinary
> "consumable", e.g.:
>
> Complex definition:
> ------------------------------------------------------------------------
> exclusive excl INT <= YES YES 0 0
> ------------------------------------------------------------------------
>
> Exec host definition (each host):
> ------------------------------------------------------------------------
> complex_values exclusive=1
> ------------------------------------------------------------------------
>
> sge request (e.g. Nx6-CPU-Jobs):
> ------------------------------------------------------------------------
> -soft -l exclusive=0.1665
> ------------------------------------------------------------------------
Exactly this soft consumable is the main problem:
Unable to run job: denied: soft requests on consumables like "exclusive" are not supported.
There was a discussion on the former mailing list about how to change this behavior.
-- Reuti
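
To make the arithmetic behind the 0.1665 request explicit, here is a minimal sketch in Python. It is not SGE code: the function name fits and the constants are made up for illustration, and it assumes the consumable is charged once per granted slot on each host. It also treats the value as a float, whereas the complex above is declared INT.

```python
# Hypothetical model of per-host consumable bookkeeping; this is not SGE code.
HOST_CAPACITY = 1.0          # complex_values exclusive=1 on each exec host
PER_SLOT_REQUEST = 0.1665    # value requested with -l exclusive=0.1665
SLOTS_PER_HOST_PER_JOB = 6   # allocation_rule 6 in the PE


def fits(used, slots, per_slot=PER_SLOT_REQUEST, capacity=HOST_CAPACITY):
    """Return True if `slots` more slots still fit into the host's capacity."""
    # A small tolerance guards against floating-point rounding.
    return used + slots * per_slot <= capacity + 1e-9


# Amount of the consumable one job takes on a host: 6 * 0.1665 = 0.999
first = SLOTS_PER_HOST_PER_JOB * PER_SLOT_REQUEST
print(round(first, 4))
print(fits(0.0, SLOTS_PER_HOST_PER_JOB))    # first job on an empty host
print(fits(first, SLOTS_PER_HOST_PER_JOB))  # second job on the same host
```

Under these assumptions the first job fits and the second does not, so as a hard request the resource would keep a second job off the node entirely. The suggestion therefore depends on the request being soft, which is exactly what qsub rejects.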
>
> Erik Soyez.
>
>
> On Wed, 16 Mar 2011, Reuti wrote:
>
>> On 16.03.2011 at 12:28, Erik Soyez wrote:
>>
>>> Good day Alex,
>>>
>>> you could try implementing an "exclusive" resource and request it with
>>> "-soft", e.g. "-soft -l exclusive" in the sge_request file as default.
>>
>> Won't this block the nodes completely? As soon as one job occupies
>> 6 slots, the second job can't start, because the "-soft -l exclusive"
>> can't be revoked again once the soft request has been granted. I think
>> this is the main reason why soft consumables are denied: the intended
>> behavior is not really clear (this could be changed so that granted soft
>> requests are handled like hard requests later on).
>>
>>
>>> I have never tried this combination but have a look at "man complex",
>>> it's just an idea.... Erik Soyez.
>>>
>>>
>>> On Wed, 16 Mar 2011, Alex Phillips wrote:
>>>
>>>> Dear List,
>>>> We have a cluster of 1920 cores spread over 160 nodes (12 cores/node). We run only one code in one queue, with jobs of between 48 and 256 cores using an MPI PE.
>>>> When benchmarking our code we found a 14-15% speedup when running on 6 cores/node compared with 12 cores/node.
>>>> We also found that if we ran on 6 cores/node, with a second job on the other 6 cores/node, we still had a 5-6% speedup.
>>>> So I have configured our MPI PE with allocation_rule = 6, and this works. However, as the cluster fills up, the scheduler starts a second job on some nodes before all the nodes are busy.
>>>> How can we configure the scheduler to run one job on every node before starting a second job on any node?
>>>> I have tried defining the number of slots as a complex value on the execution hosts, and I've tried -np_load_avg, np_load_avg, slots, and -slots as the load_formula, but I can't get it to work.
>>>> I've read http://blogs.sun.com/sgrell/entry/grid_engine_scheduler_hacks_least but I can't set the allocation rule to $pe_slots, as we only want to run on 6 cores/node, not 12.
>>>> Any suggestions?
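
For readers following the thread, the placement behavior asked for in the original post can be written down as a small Python model. This is only a sketch of the desired policy, not of how the SGE scheduler works or is configured. The function pick_hosts and the host names are hypothetical, and the 6-slots-per-host block size mirrors the allocation_rule = 6 mentioned above.

```python
def pick_hosts(used_slots, job_slots, per_host=6, host_slots=12):
    """Pick hosts for one job, least-used hosts first (hypothetical policy model).

    used_slots maps host name -> slots already in use and is updated in place.
    Returns the chosen host names, or None if the job cannot be placed yet.
    """
    if job_slots % per_host != 0:
        raise ValueError("job size must be a multiple of the per-host allocation")
    needed = job_slots // per_host
    # Only hosts with room for another block of `per_host` slots qualify.
    candidates = [h for h, used in used_slots.items() if used + per_host <= host_slots]
    # Least-used first; the host name breaks ties so the result is deterministic.
    candidates.sort(key=lambda h: (used_slots[h], h))
    if len(candidates) < needed:
        return None
    chosen = candidates[:needed]
    for host in chosen:
        used_slots[host] += per_host
    return chosen


# 160 nodes with 12 cores each, as in the original post.
cluster = {f"node{i:03d}": 0 for i in range(160)}
# Twenty 48-core jobs need 20 * 8 = 160 host blocks: exactly one per node.
for _ in range(20):
    pick_hosts(cluster, 48)
print(set(cluster.values()))  # every node now holds one 6-slot block
```

With this ordering, no node receives a second 6-slot block until every node has its first one. That is the behavior the post says the scheduler does not show once the cluster fills up.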