[gridengine users] Wrong scheduling behaviour with parallel jobs

Dave Love d.love at liverpool.ac.uk
Tue Mar 1 16:54:11 UTC 2011


Andreas Haupt <andreas.haupt at desy.de> writes:

>> There had previously been reservations for the two largest, highest
>> priority jobs in the system which had gone when I looked again and they
>> have just reappeared on a restart.
>
> How did you verify this? Via the scheduler logfile enabled with
> "MONITOR=1" in the scheduler configuration?

Yes (using Mark Dixon's scripts to process it).

> In my case restarting the master obviously doesn't help. Jobs still get
> reservations on one single PE only, if one uses wildcards in the PE
> name. Maybe this is even another problem than yours ...

Right.  We can only debug it somehow.

> PS: where should I best put my debugging information so that it can be
> used to finally solve this bug?

Good question that probably only the Univa devs can answer immediately.
I'll see if I can get any help.

