[gridengine users] SGE-6.2u5: "execd went down during job start"

Erik Soyez E.Soyez at science-computing.de
Wed May 25 11:07:00 UTC 2011


Thanks Reuti and Dave,

"Issue 3296" looks different to me: we don't use "~" in any parameters,
nor do we get the error message described there.

System details:
------------------------------------------------------------------------
Linux XXXXXXXXX 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
------------------------------------------------------------------------
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 1
------------------------------------------------------------------------
Intel Xeon X5680, 3.33 GHz
------------------------------------------------------------------------

Submit details (normal submission with qsub):
------------------------------------------------------------------------
#$ -S /bin/sh
#$ -N XXXXXXXXX
#$ -A "XXXXXXXXX"
#$ -M XXXXXXXXX
#$ -q standard
#$ -m as
#$ -pe openmpi* 24-32
#$ -l lcfx_p=1,lcfx_s=0
#$ -p -100
#$ -o SGE.out
#$ -e SGE.err
#$ -cwd
------------------------------------------------------------------------

Many thanks, Erik Soyez.

@Dave: All the best for your wrist, the typing looks fine to me....


On Wed, 25 May 2011, Dave Love wrote:

> Reuti <reuti at staff.uni-marburg.de> writes:
>
>> Am 24.05.2011 um 19:43 schrieb Erik Soyez:
>>
>>> does anyone else experience dying sge_execds during job starts?
>>> We get error messages "execd went down during job start" but cannot
>>> find any reason.  Little test jobs usually do fine and if the execd
>>> gets restarted manually it seems to keep running, too.  Ideas?
>>
>> Yeah, there was an issue about it which was triggered under certain
>> circumstances:
>>
>> http://arc.liv.ac.uk/pipermail/gridengine-users/2010-December.txt
>> (search for issue 3296)
>
> That should be searchable.  I've lost access to the site, so I can't do
> anything about it.
>
> There are other known/fixed issues, depending on the platform -- what is
> it?
>
>> @Dave: up to which number did you transfer the Issuezilla to Son of
>> GridEngine?
>
> I thought I had them all (that were still open), but it looks as if I
> missed that one (the last?).  I'll see if I can get it.
>
>> Normal submission or DRMAA - which options were given to `qsub`?
>
> -- 
> Excuse the typping -- I have a broken wrist


-- 
Vorstand/Board of Management:
Dr. Bernd Finkbeiner, Dr. Roland Niemeier, 
Dr. Arno Steitz, Dr. Ingrid Zech
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Philippe Miltin
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196 




