[gridengine users] master node selection and $fill_up behaviour revisited
Michael Weiser
M.Weiser at science-computing.de
Tue Jun 28 13:10:03 UTC 2011
Hello,

in July 2010 I asked on the users mailing list back at SunSource about a
peculiar regression in the master node selection behaviour of SGE 6.2u5
(see http://markmail.org/message/svuskq5qc6oe3axv). After some discussion
Andy pointed out that I was most likely hitting IZ 3148, which was fixed
in 6.2u6. And indeed, I was not able to trigger the bug in 6.2u6, which
was the worst possible outcome, because I couldn't upgrade.

Today I tried a recent build of V800_BRANCH of
https://github.com/gridengine/gridengine.git and was able to reproduce
the bug just as with SGE 6.2u5.

Does anyone here have a handle on the issue and could help track it down
and fix it? Or does one of the other forks perhaps fix the bug?

In short: after some jobs have run on a previously empty cluster, the
scheduler starts distributing, say, a two-slot $fill_up job over two
nodes even though one of them could have accommodated the whole job. An
example:

weiser@laudrup ~ $ qhost -j
HOSTNAME                ARCH         NCPU NSOC NCOR NTHR  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
----------------------------------------------------------------------------------------------
global                  -               -    -    -    -     -       -       -       -       -
kempes                  lx-amd64        4    0    0    0  0.00   31.4G  249.7M   33.4G     0.0
   job-ID  prior   name       user         state submit/start at     queue      master ja-task-ID
   ----------------------------------------------------------------------------------------------
        17 0.51000 STDIN      weiser       r     06/28/2011 13:52:53 normal@kem MASTER
        26 0.61000 STDIN      weiser       r     06/28/2011 13:56:53 normal@kem MASTER
laudrup                 lx-amd64        2    0    0    0  0.04    7.7G  654.5M    1.9G  224.0K
        26 0.61000 STDIN      weiser       r     06/28/2011 13:56:53 normal@lau SLAVE
maradonna               lx-amd64        4    0    0    0  0.00   31.4G  354.4M   33.4G     0.0
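
For comparison, the placement one would expect from $fill_up can be
sketched roughly as below. This only illustrates the expected semantics
and is not the SGE scheduler code. The function name fill_up, the host
order and the free-slot counts are assumptions made for the example (the
qhost output above does not show how many slots job 17 occupies).

```python
def fill_up(free_slots, requested):
    """Greedy fill-up: exhaust one host's free slots before using the next.

    free_slots: list of (host, free) pairs in the order the scheduler
                would consider the hosts (hypothetical order here).
    requested:  number of slots the parallel job asks for.
    Returns a dict host -> slots, or None if the job does not fit.
    """
    allocation = {}
    remaining = requested
    for host, free in free_slots:
        if remaining == 0:
            break
        take = min(free, remaining)
        if take > 0:
            allocation[host] = take
            remaining -= take
    return allocation if remaining == 0 else None


# Hypothetical free-slot counts, assuming job 17 holds one slot on kempes.
hosts = [("kempes", 3), ("laudrup", 2), ("maradonna", 4)]

expected = fill_up(hosts, 2)
observed = {"kempes": 1, "laudrup": 1}  # job 26 as shown by qhost -j

print(expected)                       # {'kempes': 2}
print(len(observed) > len(expected))  # True: job spans more hosts than needed
```

Under these assumptions a two-slot job should land entirely on the first
host with at least two free slots, whereas job 26 above was split into a
MASTER task on kempes and a SLAVE task on laudrup.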
Thanks in advance,
--
Michael Weiser science + computing ag
Senior Systems Engineer Geschaeftsstelle Duesseldorf
Martinstrasse 47-55, Haus A
phone: +49 211 302 708 32 D-40223 Duesseldorf
fax: +49 211 302 708 50 www.science-computing.de
--
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Roland Niemeier,
Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Philippe Miltin
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196