[gridengine users] Wrong scheduling behaviour with parallel jobs
Dave Love
d.love at liverpool.ac.uk
Tue Mar 1 16:54:11 UTC 2011
Andreas Haupt <andreas.haupt at desy.de> writes:
>> There had previously been reservations for the two largest, highest
>> priority jobs in the system which had gone when I looked again and they
>> have just reappeared on a restart.
>
> How did you verify this? Via the scheduler logfile enabled with
> "MONITOR=1" in the scheduler configuration?
Yes (using Mark Dixon's scripts to process it).
> In my case restarting the master obviously doesn't help. Jobs still get
> reservations on one single PE only, if one uses wildcards in the PE
> name. Maybe this is even a different problem from yours ...
Right. We can only debug it somehow.
> PS: where is the best place to put my debugging information so that it
> can be used to finally solve this bug?
Good question that probably only the Univa devs can answer immediately.
I'll see if I can get any help.