[gridengine users] SGE-6.2u5: "execd went down during job start"

Reuti reuti at staff.uni-marburg.de
Wed May 25 15:45:35 UTC 2011


On 25.05.2011 at 13:07, Erik Soyez wrote:

> Thanks Reuti and Dave,
>
> "issue 3296" looks different to me as we don't have any "~" as
> parameters, nor do we get the described error message.
>
> System details:
> ------------------------------------------------------------------------
> Linux XXXXXXXXX 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
> ------------------------------------------------------------------------
> SUSE Linux Enterprise Server 11 (x86_64)
> VERSION = 11
> PATCHLEVEL = 1
> ------------------------------------------------------------------------
> Intel Xeon x5680 3.33GHz
> ------------------------------------------------------------------------
>
> Submit details (normal submission with qsub):
> ------------------------------------------------------------------------
> #$ -S /bin/sh
> #$ -N XXXXXXXXX
> #$ -A "XXXXXXXXX"

The values given to -N and -A don't contain anything special, such as accented characters?
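
One quick way to check this is sketched below. It scans a job script for `#$ -N` and `#$ -A` directives and reports any value containing non-ASCII characters. The function name `non_ascii_directives` is hypothetical, and treating every non-ASCII character as suspicious is only an assumption for this check; nothing in this thread confirms that such characters are the cause of the execd failure.

```python
import re

# Matches embedded SGE directive lines such as '#$ -N name' or '#$ -A "account"'.
DIRECTIVE = re.compile(r'^#\$\s*-(N|A)\s+(.*)$')


def non_ascii_directives(script_text):
    """Return (line_number, flag, value) for -N/-A values with non-ASCII characters.

    Hypothetical helper: it only inspects the text of the job script and
    does not talk to Grid Engine in any way.
    """
    hits = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        match = DIRECTIVE.match(line.strip())
        if match and not match.group(2).isascii():
            hits.append((lineno, "-" + match.group(1), match.group(2)))
    return hits
```

To use it, read the submitted script (for example `open("job.sh", encoding="utf-8").read()`, where `job.sh` is a placeholder path) and pass the text to the function. An empty list means both values are plain ASCII.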

-- Reuti


> #$ -M XXXXXXXXX
> #$ -q standard
> #$ -m as
> #$ -pe openmpi* 24-32
> #$ -l lcfx_p=1,lcfx_s=0
> #$ -p -100
> #$ -o SGE.out
> #$ -e SGE.err
> #$ -cwd
> ------------------------------------------------------------------------
>
> Many thanks, Erik Soyez.
>
> @Dave: All the best for your wrist, the typing looks fine to me....
>
>
> On Wed, 25 May 2011, Dave Love wrote:
>
>> Reuti <reuti at staff.uni-marburg.de> writes:
>>
>>> On 24.05.2011 at 19:43, Erik Soyez wrote:
>>>
>>>> does anyone else experience dying sge_execds during job starts?
>>>> We get error messages "execd went down during job start" but cannot
>>>> find any reason.  Little test jobs usually do fine and if the execd
>>>> gets restarted manually it seems to keep running, too.  Ideas?
>>>
>>> Yeah, there was an issue about it which was triggered under certain
>>> circumstances:
>>>
>>> http://arc.liv.ac.uk/pipermail/gridengine-users/2010-December.txt
>>> (search for issue 3296)
>>
>> That should be searchable.  I've lost access to the site, so I can't do
>> anything about it.
>>
>> There are other known/fixed issues, depending on the platform -- what is
>> it?
>>
>>> @Dave: up to which number did you transfer the Issuezilla to Son of
>>> GridEngine?
>>
>> I thought I had them all (that were still open), but it looks as if I
>> missed that one (the last?).  I'll see if I can get it.
>>
>>> Normal submission or DRMAA - which options were given to `qsub`?
>>
>> -- 
>> Excuse the typping -- I have a broken wrist
>
> -- 
> Vorstand/Board of Management:
> Dr. Bernd Finkbeiner, Dr. Roland Niemeier, Dr. Arno Steitz, Dr. Ingrid Zech
> Vorsitzender des Aufsichtsrats/
> Chairman of the Supervisory Board:
> Philippe Miltin
> Sitz/Registered Office: Tuebingen
> Registergericht/Registration Court: Stuttgart
> Registernummer/Commercial Register No.: HRB 382196



